Methods of Diagnosis of Colorectal Cancer, Compositions and Methods of
Screening for Colorectal Cancer Modulators 

CROSS-REFERENCES TO RELATED APPLICATIONS

[01] This application is a continuation-in-part of U.S. Patent Application
USSN 09/663,733, filed September 15, 2000, and U.S. Patent Application USSN (not yet
known), filed August 14, 2001, each of which is incorporated herein by reference in its entirety.

FIELD OF THE INVENTION

[02] The invention relates to the identification of expression profiles and the
nucleic acids involved in colorectal cancer, and to the use of such expression profiles and
nucleic acids in diagnosis and prognosis of colorectal cancer. The invention further relates to
methods for identifying and using candidate agents and/or targets which modulate colorectal
cancer.

BACKGROUND OF THE INVENTION 
[03] Cancers of the colon and/or rectum (referred to as "colorectal cancer")
are significant in Western populations and particularly in the United States. Cancers of the
colon and rectum occur in both men and women, most commonly after the age of 50. These
cancers develop as the result of a pathologic transformation of normal colon epithelium to an
invasive cancer. A number of recently characterized genetic alterations have been implicated
in colorectal cancer, including mutations in two classes of genes, tumor-suppressor genes and
proto-oncogenes, with recent work suggesting that mutations in DNA repair genes may also
be involved in tumorigenesis. For example, inactivating mutations of both alleles of the
adenomatous polyposis coli (APC) gene, a tumor suppressor gene, appear to be one of the
earliest events in colorectal cancer, and may even be the initiating event. Other genes
implicated in colorectal cancer include the MCC gene, the p53 gene, the DCC (deleted in
colorectal carcinoma) gene and other chromosome 18q genes, and genes in the TGF-β
signaling pathway. For a review, see Molecular Biology of Colorectal Cancer, pp. 238-299,
in Curr. Probl. Cancer, Sept/Oct 1997; see also Williams, Colorectal Cancer (1996);
Kinsella & Schofield, Colorectal Cancer: A Scientific Perspective (1993); Colorectal
Cancer: Molecular Mechanisms, Premalignant State and its Prevention (Schmiegel &
Scholmerich eds., 2000); Colorectal Cancer: New Aspects of Molecular Biology and Their
Clinical Applications (Hanski et al., eds. 2000); McArdle et al., Colorectal Cancer (2000);
Wanebo, Colorectal Cancer (1993); Levin, The American Cancer Society: Colorectal Cancer
(1999); Treatment of Hepatic Metastases of Colorectal Cancer (Nordlinger & Jaeck eds.,
1993); Management of Colorectal Cancer (Dunitz et al., eds. 1998); Cancer: Principles and
Practice of Oncology (Devita et al., eds. 2001); Surgical Oncology: Contemporary Principles
and Practice (Kirby et al., eds. 2001); Offit, Clinical Cancer Genetics: Risk Counseling and
Management (1997); Radioimmunotherapy of Cancer (Abrams & Fritzberg eds. 2000);
Fleming, AJCC Cancer Staging Handbook (1998); Textbook of Radiation Oncology (Leibel
& Phillips eds. 2000); and Clinical Oncology (Abeloff et al., eds. 2000).

[04] Imaging of colorectal cancer for diagnosis has been problematic and
limited. In addition, metastasis of the tumor to the lumen, and metastasis of tumor cells to
regional lymph nodes are important prognostic factors (see, e.g., PET in Oncology: Basics
and Clinical Application (Ruhlmann et al. eds. 1999)). For example, five year survival rates
drop from 80 percent in patients with no lymph node metastases to 45 to 50 percent in those
patients who do have lymph node metastases. A recent report showed that micrometastases
can be detected from lymph nodes using reverse transcriptase-PCR methods based on the
presence of mRNA for carcinoembryonic antigen, which has previously been shown to be
present in the vast majority of colorectal cancers but not in normal tissues. Liefers et al., New
England J. of Med. 339(4):223 (1998).

[05] Thus, methods that can be used for diagnosis and prognosis of
colorectal cancer would be desirable. Accordingly, provided herein are methods that can be
used in diagnosis and prognosis of colorectal cancer. Further provided are methods that can
be used to screen candidate bioactive agents for the ability to modulate colorectal cancer.
Additionally, provided herein are molecular targets for therapeutic intervention in colorectal
and other cancers.

BRIEF SUMMARY OF THE INVENTION
[06] The present invention provides novel methods for diagnosis and
prognosis evaluation for colorectal cancer, as well as methods for screening for compositions
which modulate colorectal cancer. Methods of treatment of colorectal cancer, as well as
compositions, are also provided herein.

[07] In one aspect, a method of screening drug candidates comprises
providing a cell that expresses an expression profile gene selected from those of Table 1. The
method further includes adding a drug candidate to the cell and determining the effect of the
drug candidate on the expression of the expression profile gene.

[08] In one embodiment, the method of screening drug candidates includes
comparing the level of expression in the absence of the drug candidate to the level of
expression in the presence of the drug candidate, wherein the concentration of the drug
candidate can vary when present, and wherein the comparison can occur after addition or
removal of the drug candidate. In a preferred embodiment, the cell expresses at least two
expression profile genes. The profile genes may show an increase or decrease.

[09] Also provided herein is a method of screening for a bioactive agent
capable of binding to a colorectal cancer modulator protein, the method comprising
combining the colorectal cancer modulator protein and a candidate bioactive agent, and
determining the binding of the candidate agent to the colorectal cancer modulator protein.
Preferably the colorectal cancer modulator protein is a product encoded by a gene of Table 1
or Table 2.

[10] Further provided herein is a method for screening for a bioactive agent
capable of modulating the activity of a colorectal cancer modulator protein. In one
embodiment, the method comprises combining the colorectal cancer modulator protein and a
candidate bioactive agent, and determining the effect of the candidate agent on the bioactivity
of the colorectal cancer modulator protein. Preferably the colorectal cancer modulator
protein is a product encoded by a gene of Table 1 or Table 2.

[11] Also provided is a method of evaluating the effect of a candidate
colorectal cancer drug comprising administering the drug to a transgenic animal expressing or
over-expressing the colorectal cancer modulator protein, or an animal lacking the colorectal
cancer modulator protein, for example as a result of a gene knockout.

[12] Additionally, provided herein is a method of evaluating the effect of a
candidate colorectal cancer drug comprising administering the drug to a patient and removing
a cell sample from the patient. The expression profile of the cell is then determined. This
method may further comprise comparing the expression profile to an expression profile of a
healthy individual. In a preferred embodiment, said expression profile includes a gene of
Table 1 or Table 2.

[13] Moreover, provided herein is a biochip comprising one or more nucleic
acid segments of Table 1 or Table 2, wherein the biochip comprises fewer than 1000 nucleic
acid probes. Preferably at least two nucleic acid segments are included.

[14] Furthermore, a method of diagnosing a disorder associated with
colorectal cancer is provided. The method comprises determining the expression of a gene of
Table 1 or Table 2 in a first tissue type of a first individual, and comparing that expression to
the expression of the gene in a second, normal tissue type from the first individual or from a
second, unaffected individual. A difference in the expression indicates that the first individual
has a disorder associated with colorectal cancer.

[15] In another aspect, the present invention provides an antibody which
specifically binds to a protein encoded by a nucleic acid of Table 1 or Table 2 or a fragment
thereof. Preferably the antibody is a monoclonal antibody. The antibody can be a fragment
of an antibody such as a single stranded antibody as further described herein, or can be
conjugated to another molecule. In one embodiment, the antibody is a humanized antibody.

[16] In one embodiment, a method is provided for screening for a bioactive agent
capable of interfering with the binding of a colorectal cancer modulating protein (colorectal
cancer modulator protein) or a fragment thereof and an antibody which binds to said
colorectal cancer modulator protein or fragment thereof. In a preferred embodiment, the
method comprises combining a colorectal cancer modulator protein or fragment thereof, a
candidate bioactive agent and an antibody which binds to said colorectal cancer modulator
protein or fragment thereof. The method further includes determining the binding of said
colorectal cancer modulator protein or fragment thereof and said antibody. Where there is
a change in binding, an agent is identified as an interfering agent. The interfering agent can
be an agonist or an antagonist. Preferably, the agent inhibits colorectal cancer.

[17] In a further aspect, a method for inhibiting colorectal cancer is
provided. The method can be performed in vitro or in vivo, preferably in vivo in an
individual. In a preferred embodiment the method of inhibiting colorectal cancer is applied
to an individual with cancer. As described herein, methods of inhibiting colorectal cancer
can be performed by administering an inhibitor of the activity of a protein encoded by a
nucleic acid of Table 1 or Table 2, including an antisense molecule to the gene or its gene
product.

[18] Also provided herein are methods of eliciting an immune response in
an individual. In one embodiment a method provided herein comprises administering to an
individual a composition comprising a colorectal cancer modulating protein, or a fragment
thereof. In another embodiment, the protein is encoded by a nucleic acid selected from those
of Table 1 or Table 2. In another aspect, said composition comprises a nucleic acid
comprising a sequence encoding a colorectal cancer modulating protein, or a fragment
thereof.

[19] Further provided herein are compositions capable of eliciting an
immune response in an individual. In one embodiment, a composition provided herein
comprises a colorectal cancer modulating protein, preferably encoded by a nucleic acid of
Table 1 or Table 2, or a fragment thereof, and a pharmaceutically acceptable carrier. In
another embodiment, said composition comprises a nucleic acid comprising a sequence
encoding a colorectal cancer modulating protein, preferably selected from the nucleic acids of
Table 1 or Table 2, and a pharmaceutically acceptable carrier.

[20] Also provided are methods of neutralizing the effect of a colorectal
cancer protein, or a fragment thereof, comprising contacting an agent specific for said protein
with said protein in an amount sufficient to effect neutralization. In another embodiment, the
protein is encoded by a nucleic acid selected from those of Table 1 or Table 2.

[21] In another aspect of the invention, a method of treating an individual
for colorectal cancer is provided. In one embodiment, the method comprises administering to
said individual an inhibitor of a colorectal cancer modulating protein. In another
embodiment, the method comprises administering to a patient having colorectal cancer an
antibody to a colorectal cancer modulating protein conjugated to a therapeutic moiety. Such
a therapeutic moiety can be a cytotoxic agent or a radioisotope.

[22] Compounds and compositions are also provided. Other aspects of the 
invention will become apparent to the skilled artisan by the following description of the 
invention. 

BRIEF DESCRIPTION OF THE DRAWINGS

[NOT APPLICABLE] 

DETAILED DESCRIPTION OF THE INVENTION 
[23] The present invention provides novel methods for diagnosis and
prognosis evaluation for colorectal cancer, as well as methods for screening for compositions
which modulate colorectal cancer. The methods herein are related to those of U.S. Patent
Application Serial No. 09/525,993 and International Patent Application No.
PCT/US00/07044, each of which is incorporated herein in its entirety.

[24] By "colorectal cancer" herein is meant a colon and/or rectal tumor or
cancer that is classified as Dukes stage A or B as well as metastatic tumors classified as
Dukes stage C or D (see, e.g., Cohen et al., Cancer of the Colon, in Cancer: Principles and
Practice of Oncology, pp. 1144-1197 (Devita et al., eds., 5th ed. 1997); see also Harrison's
Principles of Internal Medicine, pp. 1289-129 (Wilson et al., eds., 12th ed., 1991)).
"Treatment, monitoring, detection or modulation of colorectal cancer" includes treatment,
monitoring, detection, or modulation of colorectal disease in those patients who have
colorectal disease (Dukes stage A, B, C or D) in which gene expression from a gene in Table
1 or 2 is increased or decreased, indicating that the subject is more likely to progress to
metastatic disease than a patient who does not have an increase or decrease in gene
expression of a gene in Table 1 or 2. In Dukes stage A, the tumor has penetrated into, but not
through, the bowel wall. In Dukes stage B, the tumor has penetrated through the bowel wall
but there is not yet any lymph involvement. In Dukes stage C, the cancer involves regional
lymph nodes. In Dukes stage D, there is distant metastasis, e.g., liver, lung, etc.
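
By way of illustration only, the four Dukes stage descriptions above can be expressed as a simple decision rule. The following Python sketch is not part of the staging references cited; the function name and its three boolean parameters are hypothetical, and it assumes the tumor has at least penetrated into the bowel wall.

```python
def dukes_stage(through_bowel_wall: bool,
                regional_lymph_nodes: bool,
                distant_metastasis: bool) -> str:
    """Return the Dukes stage letter implied by the descriptions in the text.

    Hypothetical helper for illustration; assumes the tumor has at least
    penetrated into the bowel wall (the minimum for stage A).
    """
    if distant_metastasis:
        return "D"  # distant metastasis, e.g., liver or lung
    if regional_lymph_nodes:
        return "C"  # regional lymph nodes are involved
    if through_bowel_wall:
        return "B"  # through the bowel wall, no lymph involvement
    return "A"      # into, but not through, the bowel wall


if __name__ == "__main__":
    print(dukes_stage(True, False, False))
```

The checks run from the most advanced finding to the least, so a tumor with both lymph node involvement and distant metastasis is reported as stage D.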

[25] Table 1 provides Unigene cluster identification numbers for the
nucleotide sequence of genes that exhibit increased expression in colorectal cancer samples.
Table 1 also provides an exemplar accession number that provides a nucleotide sequence
that is part of the Unigene cluster. Table 2 provides the nucleic acid and protein sequence of
the CBF9 gene as well as the Unigene and Exemplar accession numbers for CBF9.

[26] In one aspect, the expression levels of genes are determined in
different patient samples for which either diagnosis or prognosis information is desired, to
provide expression profiles. An expression profile of a particular sample is essentially a
"fingerprint" of the state of the sample; while two states may have any particular gene
similarly expressed, the evaluation of a number of genes simultaneously allows the
generation of a gene expression profile that is unique to the state of the cell. That is, normal
tissue may be distinguished from colorectal cancer tissue, and within colorectal cancer
tissue, different prognosis states (good or poor long term survival prospects, for example)
may be determined. By comparing expression profiles of colon tissue in known different
states, information regarding which genes are important (including both up- and
down-regulation of genes) in each of these states is obtained. The identification of sequences
that are differentially expressed in colorectal cancer versus normal colon tissue, as well as
differential expression resulting in different prognostic outcomes, allows the use of this
information in a number of ways. For example, a particular treatment regime may be
evaluated: does a chemotherapeutic drug act to improve the long-term
prognosis in a particular patient? Similarly, diagnosis may be done or confirmed by
comparing patient samples with the known expression profiles. Furthermore, these gene
expression profiles (or individual genes) allow screening of drug candidates with an eye to
mimicking or altering a particular expression profile; for example, screening can be done for
drugs that suppress the colorectal cancer expression profile or convert a poor prognosis
profile to a better prognosis profile. This may be done by making biochips comprising sets of
the important colorectal cancer genes, which can then be used in these screens. These
methods can also be done on the protein basis; that is, protein expression levels of the
colorectal cancer proteins can be evaluated for diagnostic and prognostic purposes or to
screen candidate agents. In addition, the colorectal cancer nucleic acid sequences can be
administered for gene therapy purposes, including the administration of antisense nucleic
acids, or the colorectal cancer proteins (including antibodies and other modulators thereof)
administered as therapeutic drugs.

[27] Thus the present invention provides nucleic acid and protein
sequences that are differentially expressed in colorectal cancer, herein termed "colorectal
cancer sequences". As outlined below, colorectal cancer sequences include those that are
up-regulated (i.e. expressed at a higher level) in colorectal cancer, as well as those that are
down-regulated (i.e. expressed at a lower level) in colorectal cancer. In a preferred
embodiment, the colorectal cancer sequences are from humans; however, as will be
appreciated by those in the art, colorectal cancer sequences from other organisms may be
useful in animal models of disease and drug evaluation; thus, other colorectal cancer
sequences are provided, from vertebrates, including mammals, including rodents (rats, mice,
hamsters, guinea pigs, etc.), primates, and farm animals (including sheep, goats, pigs, cows,
horses, etc.). Colorectal cancer sequences from other organisms may be obtained using the
techniques outlined below.

[28] Colorectal cancer sequences can include both nucleic acid and amino
acid sequences. In a preferred embodiment, the colorectal cancer sequences are recombinant
nucleic acids. By the term "recombinant nucleic acid" herein is meant nucleic acid, originally
formed in vitro, in general, by the manipulation of nucleic acid by polymerases and
endonucleases, in a form not normally found in nature. Thus an isolated nucleic acid, in a
linear form, or an expression vector formed in vitro by ligating DNA molecules that are not
normally joined, are both considered recombinant for the purposes of this invention. It is
understood that once a recombinant nucleic acid is made and reintroduced into a host cell or
organism, it will replicate non-recombinantly, i.e. using the in vivo cellular machinery of the
host cell rather than in vitro manipulations; however, such nucleic acids, once produced
recombinantly, although subsequently replicated non-recombinantly, are still considered
recombinant for the purposes of the invention.

[29] Similarly, a "recombinant protein" is a protein made using recombinant
techniques, i.e. through the expression of a recombinant nucleic acid as depicted above. A
recombinant protein is distinguished from naturally occurring protein by at least one or more
characteristics. For example, the protein may be isolated or purified away from some or all
of the proteins and compounds with which it is normally associated in its wild type host, and
thus may be substantially pure. For example, an isolated protein is unaccompanied by at least
some of the material with which it is normally associated in its natural state, preferably
constituting at least about 0.5%, more preferably at least about 5% by weight of the total
protein in a given sample. A substantially pure protein comprises at least about 75% by
weight of the total protein, with at least about 80% being preferred, and at least about 90%
being particularly preferred. The definition includes the production of a colorectal cancer
protein from one organism in a different organism or host cell. Alternatively, the protein may
be made at a significantly higher concentration than is normally seen, through the use of an
inducible promoter or high expression promoter, such that the protein is made at increased
concentration levels. Alternatively, the protein may be in a form not normally found in
nature, as in the addition of an epitope tag or amino acid substitutions, insertions and
deletions, as discussed below.

[30] In a preferred embodiment, the colorectal cancer sequences are
nucleic acids. As will be appreciated by those in the art and is more fully outlined below,
colorectal cancer sequences are useful in a variety of applications, including diagnostic
applications, which will detect naturally occurring nucleic acids, as well as screening
applications; for example, biochips comprising nucleic acid probes to the colorectal cancer
sequences can be generated. In the broadest sense, then, by "nucleic acid" or
"oligonucleotide" or grammatical equivalents herein is meant at least two nucleotides
covalently linked together. A nucleic acid of the present invention will generally contain
phosphodiester bonds, although in some cases, as outlined below, nucleic acid analogs are
included that may have alternate backbones, comprising, for example, phosphoramidate
(Beaucage et al., Tetrahedron 49(10):1925 (1993) and references therein; Letsinger, J. Org.
Chem. 35:3800 (1970); Sprinzl et al., Eur. J. Biochem. 81:579 (1977); Letsinger et al., Nucl.
Acids Res. 14:3487 (1986); Sawai et al., Chem. Lett. 805 (1984); Letsinger et al., J. Am.
Chem. Soc. 110:4470 (1988); and Pauwels et al., Chemica Scripta 26:141 (1986)),
phosphorothioate (Mag et al., Nucleic Acids Res. 19:1437 (1991); and U.S. Patent No.
5,644,048), phosphorodithioate (Briu et al., J. Am. Chem. Soc. 111:2321 (1989)),
O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues: A Practical
Approach, Oxford University Press), and peptide nucleic acid backbones and linkages (see
Egholm, J. Am. Chem. Soc. 114:1895 (1992); Meier et al., Chem. Int. Ed. Engl. 31:1008
(1992); Nielsen, Nature 365:566 (1993); Carlsson et al., Nature 380:207 (1996), all of which
are incorporated by reference). Other analog nucleic acids include those with positive
backbones (Denpcy et al., Proc. Natl. Acad. Sci. USA 92:6097 (1995)); non-ionic backbones
(U.S. Patent Nos. 5,386,023, 5,637,684, 5,602,240, 5,216,141 and 4,469,863; Kiedrowshi et
al., Angew. Chem. Intl. Ed. English 30:423 (1991); Letsinger et al., J. Am. Chem. Soc.
110:4470 (1988); Letsinger et al., Nucleoside & Nucleotide 13:1597 (1994); Chapters 2 and
3, ASC Symposium Series 580, "Carbohydrate Modifications in Antisense Research", Ed.
Y.S. Sanghui and P. Dan Cook; Mesmaeker et al., Bioorganic & Medicinal Chem. Lett. 4:395
(1994); Jeffe et al., J. Biomolecular NMR 34:17 (1994); Tetrahedron Lett. 37:743 (1996)) and
non-ribose backbones, including those described in U.S. Patent Nos. 5,235,033 and
5,034,506, and Chapters 6 and 7, ASC Symposium Series 580, "Carbohydrate Modifications
in Antisense Research", Ed. Y.S. Sanghui and P. Dan Cook. Nucleic acids containing one or
more carbocyclic sugars are also included within one definition of nucleic acids (see Jenkins
et al., Chem. Soc. Rev. (1995) pp. 169-176). Several nucleic acid analogs are described in
Rawls, C & E News June 2, 1997 page 35. All of these references are hereby expressly
incorporated by reference. These modifications of the ribose-phosphate backbone may be
done for a variety of reasons, for example to increase the stability and half-life of such
molecules in physiological environments or as probes on a biochip.

[31] As will be appreciated by those in the art, all of these nucleic acid
analogs may find use in the present invention. In addition, mixtures of naturally occurring
nucleic acids and analogs can be made; alternatively, mixtures of different nucleic acid
analogs may be made.

[32] Particularly preferred are peptide nucleic acids (PNA), which include
peptide nucleic acid analogs. These backbones are substantially non-ionic under neutral
conditions, in contrast to the highly charged phosphodiester backbone of naturally occurring
nucleic acids. This results in two advantages. First, the PNA backbone exhibits improved
hybridization kinetics. PNAs have larger changes in the melting temperature (Tm) for
mismatched versus perfectly matched basepairs. DNA and RNA typically exhibit a 2-4°C
drop in Tm for an internal mismatch. With the non-ionic PNA backbone, the drop is closer to
7-9°C. Similarly, due to their non-ionic nature, hybridization of the bases attached to these
backbones is relatively insensitive to salt concentration. In addition, PNAs are not degraded
by cellular enzymes, and thus can be more stable.

[33] The nucleic acids may be single stranded or double stranded, as
specified, or contain portions of both double stranded or single stranded sequence. As will be
appreciated by those in the art, the depiction of a single strand ("Watson") also defines the
sequence of the other strand ("Crick"); thus the sequences described herein also include the
complement of the sequence. The nucleic acid may be DNA, both genomic and cDNA, RNA
or a hybrid, where the nucleic acid contains any combination of deoxyribo- and
ribo-nucleotides, and any combination of bases, including uracil, adenine, thymine, cytosine,
guanine, inosine, xanthine, hypoxanthine, isocytosine, isoguanine, etc. As used herein, the
term "nucleoside" includes nucleotides and nucleoside and nucleotide analogs, and modified
nucleosides such as amino modified nucleosides. In addition, "nucleoside" includes
non-naturally occurring analog structures. Thus for example the individual units of a peptide
nucleic acid, each containing a base, are referred to herein as a nucleoside.
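
As a minimal illustration of how the depiction of one strand ("Watson") defines the other ("Crick"), the following Python sketch derives the complementary strand of a DNA sequence, read 5' to 3'. The function name is hypothetical, and the sketch assumes a sequence containing only the four standard DNA bases A, C, G and T; the other bases listed above are not handled.

```python
# Translation table for the four standard DNA bases (upper and lower case).
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the complementary strand of a DNA sequence, written 5' to 3'.

    Hypothetical helper for illustration; characters other than
    A, C, G and T are passed through unchanged.
    """
    # Complement each base, then reverse so the result reads 5' to 3'.
    return sequence.translate(_DNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
```

Applying the function twice returns the original sequence, which reflects the fact that each strand fully specifies the other.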

[34] A colorectal cancer sequence can be initially identified by substantial
nucleic acid and/or amino acid sequence homology to the colorectal cancer sequences
outlined herein. Such homology can be based upon the overall nucleic acid or amino acid
sequence, and is generally determined as outlined below, using either homology programs or
hybridization conditions.

[35] The isolation of mRNA comprises isolating total cellular RNA by
disrupting a cell and performing differential centrifugation. Once the total RNA is isolated,
mRNA is isolated by making use of the adenine nucleotide residues known to those skilled in
the art as a poly(A) tail found on virtually every eukaryotic mRNA molecule at the 3' end
thereof. Oligonucleotides composed of only deoxythymidine [oligo(dT)] are linked to
cellulose and the oligo(dT)-cellulose packed into small columns. When a preparation of total
cellular RNA is passed through such a column, the mRNA molecules bind to the oligo(dT) by
the poly(A) tails while the rest of the RNA flows through the column. The bound mRNAs
are then eluted from the column and collected.

[36] The colorectal cancer sequences of the invention can be identified as
follows. Samples of normal and tumor tissue are applied to biochips comprising nucleic acid
probes. The samples are first microdissected, if applicable, and treated as described above
for the preparation of mRNA. Suitable biochips are commercially available, for example
from Affymetrix. Gene expression profiles as described herein are generated, and the data
analyzed.

[37] In a preferred embodiment, the genes showing changes in expression
as between normal and disease states are compared to genes expressed in other normal
tissues, including, but not limited to lung, heart, brain, liver, breast, kidney, muscle, prostate,
small intestine, large intestine, spleen, bone, and placenta. In a preferred embodiment, those
genes identified during the colorectal cancer screen that are expressed in any significant
amount in other tissues are removed from the profile, although in some embodiments, this is
not necessary. That is, when screening for drugs, it is preferable that the target be disease
specific, to minimize possible side effects.
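
The removal of genes expressed in other normal tissues can be sketched, purely for illustration, as a filter over a table of expression levels. In the following Python sketch the function name, the data layout (a mapping from gene identifier to per-tissue expression level) and the numeric cutoff are all assumptions; the text does not quantify "any significant amount".

```python
from typing import Dict, Iterable, List


def filter_disease_specific(candidates: Iterable[str],
                            normal_tissue_expression: Dict[str, Dict[str, float]],
                            max_level: float) -> List[str]:
    """Keep candidate genes not significantly expressed in other normal tissues.

    Hypothetical helper for illustration. `max_level` is an assumed cutoff
    standing in for "any significant amount", which the text leaves undefined.
    """
    kept = []
    for gene in candidates:
        # Genes with no recorded normal-tissue data are kept (empty mapping).
        levels = normal_tissue_expression.get(gene, {})
        if all(level <= max_level for level in levels.values()):
            kept.append(gene)
    return kept


if __name__ == "__main__":
    # Invented example values, not data from Table 1 or Table 2.
    expression = {
        "geneA": {"lung": 0.1, "heart": 0.0, "liver": 0.2},
        "geneB": {"lung": 5.0, "heart": 0.1, "liver": 0.3},
    }
    print(filter_disease_specific(["geneA", "geneB"], expression, max_level=1.0))
```

A single tissue above the cutoff is enough to exclude a gene, which matches the stated preference for disease-specific targets; a less strict rule could be substituted in embodiments where this filtering is not necessary.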

[38] In a preferred embodiment, colorectal cancer sequences are those that
are up-regulated in colorectal cancer; that is, the expression of these genes is higher in
colorectal carcinoma as compared to normal colon tissue. "Up-regulation" as used herein
means at least about a 1.1 fold change, preferably a 1.5 or two fold change, preferably at least
about a three fold change, with at least about five-fold or higher being preferred. All
accession numbers herein are for the GenBank sequence database and the sequences of the
accession numbers are hereby expressly incorporated by reference. GenBank is known in the
art, see, e.g., Benson, DA, et al., Nucleic Acids Research 26:1-7 (1998) and
http://www.ncbi.nlm.nih.gov/. In addition, these genes were found to be expressed in a
limited amount or not at all in heart, brain, lung, liver, breast, kidney, prostate, small intestine
and spleen.

[39] In a preferred embodiment, colorectal cancer sequences are those that
are down-regulated in colorectal cancer; that is, the expression of these genes is lower in
colorectal carcinoma as compared to normal colon tissue. "Down-regulation" as used herein
means at least about a two-fold change, preferably at least about a three fold change, with at
least about five-fold or higher being preferred.
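
For illustration, the minimum fold changes recited above for up-regulation (at least about 1.1 fold) and down-regulation (at least about two-fold) can be applied to a pair of expression measurements as follows. This Python sketch is an assumption-laden example rather than the method actually used to compile Table 1: the function name is hypothetical, fold change is taken as a simple ratio of tumor to normal expression, and both values are assumed to be positive.

```python
def classify_regulation(tumor_level: float,
                        normal_level: float,
                        up_min_fold: float = 1.1,
                        down_min_fold: float = 2.0) -> str:
    """Classify a gene from tumor and normal colon expression levels.

    Hypothetical helper for illustration. The default thresholds are the
    minimum fold changes recited in the text; the more preferred three-fold
    or five-fold cutoffs can be passed in instead.
    """
    if tumor_level <= 0 or normal_level <= 0:
        raise ValueError("expression levels must be positive")
    if tumor_level / normal_level >= up_min_fold:
        return "up-regulated"
    if normal_level / tumor_level >= down_min_fold:
        return "down-regulated"
    return "unchanged"


if __name__ == "__main__":
    print(classify_regulation(3.0, 1.0))
    print(classify_regulation(1.0, 4.0))
```

Passing `up_min_fold=5.0` or `down_min_fold=5.0` applies the most preferred thresholds instead of the minimum ones.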

[40] Colorectal cancer proteins of the present invention may be classified
as secreted proteins, transmembrane proteins or intracellular proteins. In a preferred
embodiment the colorectal cancer protein is an intracellular protein. Intracellular proteins
may be found in the cytoplasm and/or in the nucleus. Intracellular proteins are involved in all
aspects of cellular function and replication (including, for example, signaling pathways);
aberrant expression of such proteins results in unregulated or disregulated cellular processes.
For example, many intracellular proteins have enzymatic activity such as protein kinase
activity, protein phosphatase activity, protease activity, nucleotide cyclase activity,
polymerase activity and the like. Intracellular proteins also serve as docking proteins that are
involved in organizing complexes of proteins, or targeting proteins to various subcellular
localizations, and are involved in maintaining the structural integrity of organelles.

[41] An increasingly appreciated concept in characterizing intracellular
proteins is the presence in the proteins of one or more motifs for which defined functions
have been attributed. In addition to the highly conserved sequences found in the enzymatic
domain of proteins, highly conserved sequences have been identified in proteins that are
involved in protein-protein interaction. For example, Src-homology-2 (SH2) domains bind
tyrosine-phosphorylated targets in a sequence dependent manner. PTB domains, which are
distinct from SH2 domains, also bind tyrosine phosphorylated targets. SH3 domains bind to
proline-rich targets. In addition, PH domains, tetratricopeptide repeats and WD domains, to
name only a few, have been shown to mediate protein-protein interactions. Some of these
may also be involved in binding to phospholipids or other second messengers. As will be
appreciated by one of ordinary skill in the art, these motifs can be identified on the basis of
primary sequence; thus, an analysis of the sequence of proteins may provide insight into both
the enzymatic potential of the molecule and/or molecules with which the protein may
associate.

[42] In a preferred embodiment, the colorectal cancer sequences are
transmembrane proteins. Transmembrane proteins are molecules that span the phospholipid
bilayer of a cell. They may have an intracellular domain, an extracellular domain, or both.
The intracellular domains of such proteins may have a number of functions including those
already described for intracellular proteins. For example, the intracellular domain may have
enzymatic activity and/or may serve as a binding site for additional proteins. Frequently the
intracellular domain of transmembrane proteins serves both roles. For example, certain
receptor tyrosine kinases have both protein kinase activity and SH2 domains. In addition,
autophosphorylation of tyrosines on the receptor molecule itself creates binding sites for
additional SH2 domain containing proteins.

[43] Transmembrane proteins may contain from one to many
transmembrane domains. For example, receptor tyrosine kinases, certain cytokine receptors,
receptor guanylyl cyclases and receptor serine/threonine protein kinases contain a single
transmembrane domain. However, various other proteins including channels and adenylyl
cyclases contain numerous transmembrane domains. Many important cell surface receptors
are classified as "seven transmembrane domain" proteins, as they contain 7 membrane
spanning regions. Important transmembrane protein receptors include, but are not limited to,
insulin receptor, insulin-like growth factor receptor, human growth hormone receptor,
glucose transporters, transferrin receptor, epidermal growth factor receptor, low density
lipoprotein receptor, leptin receptor, interleukin receptors,
e.g. IL-1 receptor, IL-2 receptor, etc.

[44] Characteristics of transmembrane domains include approximately 20
consecutive hydrophobic amino acids that may be followed by charged amino acids.
Therefore, upon analysis of the amino acid sequence of a particular protein, the localization
and number of transmembrane domains within the protein may be predicted.
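
The prediction described above can be sketched as a scan for long runs of hydrophobic residues. This is a simplified illustration: the hydrophobic alphabet, the function name, and the strict run-length rule are assumptions; practical predictors typically use hydropathy scales averaged over a sliding window rather than an all-or-nothing residue set.

```python
import re

# Assumed set of hydrophobic one-letter residue codes (illustrative only).
HYDROPHOBIC = "AILMFVWCG"


def predict_tm_segments(seq: str, min_len: int = 20) -> list[tuple[int, int]]:
    """Return (start, end) spans of runs of at least min_len hydrophobic residues."""
    pattern = re.compile(f"[{HYDROPHOBIC}]{{{min_len},}}")
    return [(m.start(), m.end()) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # A 20-residue hydrophobic stretch followed by charged residues (K, R, D, E).
    toy = "MKR" + "LIVAL" * 4 + "KRDE" + "AVL"
    print(predict_tm_segments(toy))
```

The number of spans returned gives the predicted number of transmembrane domains, and the coordinates give their localization within the sequence.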

[45] The extracellular domains of transmembrane proteins are diverse;
however, conserved motifs are found repeatedly among various extracellular domains.
Conserved structure and/or functions have been ascribed to different extracellular motifs. For
example, cytokine receptors are characterized by a cluster of cysteines and a WSXWS (W=
tryptophan, S= serine, X=any amino acid) motif. Immunoglobulin-like domains are highly
conserved. Mucin-like domains may be involved in cell adhesion and leucine-rich repeats
participate in protein-protein interactions.
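
As an illustration of identifying such a motif from primary sequence, the WSXWS pattern can be located with a regular expression. The function name is hypothetical, and the example sequence is made up for demonstration.

```python
import re

# W = tryptophan, S = serine, X = any amino acid. A lookahead is used so
# that overlapping occurrences are also reported.
WSXWS = re.compile(r"(?=WS[A-Z]WS)")


def find_wsxws(seq: str) -> list[int]:
    """Return the 0-based start positions of every WSXWS motif in seq."""
    return [m.start() for m in WSXWS.finditer(seq.upper())]


if __name__ == "__main__":
    print(find_wsxws("AAWSEWSAA"))
```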

[46] Many extracellular domains are involved in binding to other
molecules. In one aspect, extracellular domains are receptors. Factors that bind the receptor
domain include circulating ligands, which may be peptides, proteins, or small molecules such
as adenosine and the like. For example, growth factors such as EGF, FGF and PDGF are
circulating growth factors that bind to their cognate receptors to initiate a variety of cellular
responses. Other factors include cytokines, mitogenic factors, neurotrophic factors and the
like. Extracellular domains also bind to cell-associated molecules. In this respect, they
mediate cell-cell interactions. Cell-associated ligands can be tethered to the cell, for example
via a glycosylphosphatidylinositol (GPI) anchor, or may themselves be transmembrane
proteins. Extracellular domains also associate with the extracellular matrix and contribute to
the maintenance of the cell structure.

[47] Colorectal cancer proteins that are transmembrane are particularly
preferred in the present invention as they are good targets for immunotherapeutics, as are
described herein. In addition, as outlined below, transmembrane proteins can also be useful
in imaging modalities.

[48] It will also be appreciated by those in the art that a transmembrane
protein can be made soluble by removing transmembrane sequences, for example through
recombinant methods. Furthermore, transmembrane proteins that have been made soluble
can be made to be secreted through recombinant means by adding an appropriate signal
sequence.

[49] In a preferred embodiment, the colorectal cancer proteins are secreted
proteins, the secretion of which can be either constitutive or regulated. These proteins have a
signal peptide or signal sequence that targets the molecule to the secretory pathway. Secreted
proteins are involved in numerous physiological events; by virtue of their circulating nature,
they serve to transmit signals to various other cell types. The secreted protein may function in
an autocrine manner (acting on the cell that secreted the factor), a paracrine manner (acting
on cells in close proximity to the cell that secreted the factor) or an endocrine manner (acting
on cells at a distance). Thus secreted molecules find use in modulating or altering numerous
aspects of physiology. Colorectal cancer proteins that are secreted proteins are particularly
preferred in the present invention as they serve as good targets for diagnostic markers, for
example for blood tests.

[50] A colorectal cancer sequence is initially identified by substantial
nucleic acid and/or amino acid sequence homology to the colorectal cancer sequences
outlined herein. Such homology can be based upon the overall nucleic acid or amino acid
sequence, and is generally determined as outlined below, using either homology programs or
hybridization conditions.

[51] As used herein, the terms "colorectal cancer nucleic acid", "colorectal
cancer protein" or "colorectal cancer polynucleotide" or "colorectal cancer-associated
transcript" refer to nucleic acid and polypeptide polymorphic variants, alleles, mutants, and
interspecies homologs that: (1) have a nucleotide sequence that has greater than about 60%
nucleotide sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%,
94%, 95%, 96%, 97%, 98% or 99% or greater nucleotide sequence identity, preferably over a
region of at least about 25, 50, 100, 200, 500, 1000, or more nucleotides, to a
nucleotide sequence of or associated with a unigene cluster of Table 1 or Table 2; (2) bind to
antibodies, e.g., polyclonal antibodies, raised against an immunogen comprising an amino
acid sequence encoded by a nucleotide sequence of or associated with a unigene cluster of
Table 1 or Table 2, and conservatively modified variants thereof; (3) specifically hybridize
under stringent hybridization conditions to a nucleic acid sequence, or the complement
thereof, of Table 1 or Table 2 and conservatively modified variants thereof; or (4) have an
amino acid sequence that has greater than about 60% amino acid sequence identity, 65%,
70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%
or greater amino acid sequence identity, preferably over a region of at least about
25, 50, 100, 200, 500, 1000, or more amino acids, to an amino acid sequence encoded by a
nucleotide sequence of or associated with a unigene cluster of Table 1 or Table 2. A
polynucleotide or polypeptide sequence is typically from a mammal including, but not
limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or
other mammal. A "colorectal cancer polypeptide" and a "colorectal cancer polynucleotide"
include both naturally occurring and recombinant forms.

[52] Homology in this context means sequence similarity or identity, with
identity being preferred. A preferred comparison for homology purposes is to compare the
sequence containing sequencing errors to the correct sequence. This homology will be
determined using standard techniques known in the art, including, but not limited to, the local
homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), the
homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970),
the search for similarity method of Pearson & Lipman, PNAS USA 85:2444 (1988),
computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA
in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Drive,
Madison, WI), the Best Fit sequence program described by Devereux et al., Nucl. Acid Res.
12:387-395 (1984), preferably using the default settings, or inspection.

[53] In a preferred embodiment, the sequences which are used to determine 
sequence identity or similarity are selected from the sequences set forth in Table 1 or Table 2.
In one embodiment the sequences utilized herein are those set forth in Table 1 or Table 2. In
another embodiment, the sequences are naturally occurring allelic variants of the sequences 
set forth in Table 1 or Table 2. In another embodiment, the sequences are sequence variants 
as further described herein. 

[54] The terms "identical" or percent "identity," in the context of two or
more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences
that are the same or have a specified percentage of amino acid residues or nucleotides that are
the same (i.e., about 60% identity, preferably 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%,
94%, 95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, when compared
and aligned for maximum correspondence over a comparison window or designated region)
as measured using the BLAST or BLAST 2.0 sequence comparison algorithms with default
parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI
web site http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to
be "substantially identical." This definition also refers to, or may be applied to, the
complement of a test sequence. The definition also includes sequences that have deletions
and/or additions, as well as those that have substitutions, as well as naturally occurring, e.g.,
polymorphic or allelic variants, and man-made variants. As described below, the preferred
algorithms can account for gaps and the like. Preferably, identity exists over a region that is
at least about 25 amino acids or nucleotides in length, or more preferably over a region that is
50-100 amino acids or nucleotides in length.
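
A minimal sketch of a percent-identity calculation over two already-aligned sequences follows. The function name and the choice of denominator (all alignment columns, including gap columns) are assumptions; alignment programs differ in how they count gaps, so values from this sketch need not match those reported by BLAST.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both rows.

    Both inputs must be rows of one alignment, with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both rows; they carry no information.
    columns = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    print(round(percent_identity("ACGT-ACGT", "ACGTTACCT"), 1))
```

Under the definition above, two sequences would then be compared against a chosen threshold (for example about 60% or higher) over the specified region.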

[55] For sequence comparison, typically one sequence acts as a reference
sequence, to which test sequences are compared. When using a sequence comparison
algorithm, test and reference sequences are entered into a computer, subsequence coordinates
are designated, if necessary, and sequence algorithm program parameters are designated.
Preferably, default program parameters can be used, or alternative parameters can be
designated. The sequence comparison algorithm then calculates the percent sequence
identities for the test sequences relative to the reference sequence, based on the program
parameters.

[56] A "comparison window", as used herein, includes reference to a
segment of any one of the number of contiguous positions selected from the group consisting
typically of from 20 to 600, usually about 50 to about 200, more usually about 100 to about
150, in which a sequence may be compared to a reference sequence of the same number of
contiguous positions after the two sequences are optimally aligned. Methods of alignment of
sequences for comparison are well-known in the art. Optimal alignment of sequences for
comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman,
Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman &
Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson &
Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of
these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics
Software Package, Genetics Computer Group, 575 Science Dr., Madison, WI), or by manual
alignment and visual inspection (see, e.g., Current Protocols in Molecular Biology (Ausubel
et al., eds. 1995 supplement)).
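
For orientation, the local homology approach cited above can be sketched as a small dynamic program. This is a teaching sketch, not the GCG implementation: the scoring values and the linear gap penalty are assumptions chosen for brevity, and only the best local score is returned rather than the alignment itself.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which is what makes the
            # alignment local rather than global.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

Only two rows of the scoring matrix are kept in memory, which is sufficient when the score alone is needed.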

[57] Preferred examples of algorithms that are suitable for determining
percent sequence identity and sequence similarity include the BLAST and BLAST 2.0
algorithms, which are described in Altschul et al., Nuc. Acids Res. 25:3389-3402 (1997) and
Altschul et al., J. Mol. Biol. 215:403-410 (1990). BLAST and BLAST 2.0 are used, with the
parameters described herein, to determine percent sequence identity for the nucleic acids and
proteins of the invention. Software for performing BLAST analyses is publicly available
through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/).
This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying
short words of length W in the query sequence, which either match or satisfy some
positive-valued threshold score T when aligned with a word of the same length in a database
sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra).
These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs
containing them. The word hits are extended in both directions along each sequence for as
far as the cumulative alignment score can be increased. Cumulative scores are calculated
using, e.g., for nucleotide sequences, the parameters M (reward score for a pair of matching
residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino
acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the
word hits in each direction is halted when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the cumulative score goes to zero or below,
due to the accumulation of one or more negative-scoring residue alignments; or the end of
either sequence is reached. The BLAST algorithm parameters W, T, and X determine the
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences)
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a
comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)), alignments (B) of
50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
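
The seed-and-extend procedure described above can be sketched for nucleotide sequences as follows. This is a simplified illustration, not the BLAST source: seeds are exact word matches only (the neighborhood threshold T is omitted), the extension is ungapped, and the function names and the default value of X are assumptions. M and N take the values given in the text.

```python
def find_seeds(query: str, subject: str, word_len: int) -> list[tuple[int, int]]:
    """Return (query_pos, subject_pos) pairs where a word of length W matches exactly."""
    index: dict[str, list[int]] = {}
    for s in range(len(subject) - word_len + 1):
        index.setdefault(subject[s:s + word_len], []).append(s)
    return [(q, s)
            for q in range(len(query) - word_len + 1)
            for s in index.get(query[q:q + word_len], [])]


def extend_hit(query: str, subject: str, q_pos: int, s_pos: int, word_len: int,
               reward: int = 5, penalty: int = -4, x_drop: int = 20):
    """Ungapped two-way extension of one seed; returns (score, query_span)."""
    seed_score = reward * word_len  # the seed itself is an exact match

    def extend(q: int, s: int, step: int, base: int) -> tuple[int, int]:
        best = running = best_len = length = 0
        while 0 <= q < len(query) and 0 <= s < len(subject):
            running += reward if query[q] == subject[s] else penalty
            length += 1
            if running > best:
                best, best_len = running, length
            elif best - running >= x_drop or base + running <= 0:
                break  # fell off by X from the maximum, or score reached zero
            q += step
            s += step
        return best, best_len

    right, right_len = extend(q_pos + word_len, s_pos + word_len, 1, seed_score)
    left, left_len = extend(q_pos - 1, s_pos - 1, -1, seed_score + right)
    span = (q_pos - left_len, q_pos + word_len + right_len)
    return seed_score + left + right, span


if __name__ == "__main__":
    q, s = "GGGACGTACGTCCC", "TTTACGTACGTAAA"
    print(find_seeds(q, s, 4)[:3], extend_hit(q, s, 3, 3, 4))
```

A short word length of 4 is used in the demonstration so the toy sequences produce seeds; the BLASTN default of 11 given in the text would be passed as `word_len` for real data.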

[58] The BLAST algorithm also performs a statistical analysis of the
similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA
90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the
smallest sum probability (P(N)), which provides an indication of the probability by which a
match between two nucleotide or amino acid sequences would occur by chance. For
example, a nucleic acid is considered similar to a reference sequence if the smallest sum
probability in a comparison of the test nucleic acid to the reference nucleic acid is less than
about 0.2, more preferably less than about 0.01, and most preferably less than about 0.001.
Log values may be large negative numbers, e.g., 5, 10, 20, 30, 40, 40, 70, 90, 110, 150, 170,
etc.

[59] In one embodiment, the nucleic acid homology is determined through
hybridization studies. Thus, for example, nucleic acids which hybridize under high
stringency to the nucleic acid sequences which encode the peptides identified in Table 1 or
Table 2, or their complements, are considered a colorectal cancer sequence. High stringency
conditions are known in the art; see for example Maniatis et al., Molecular Cloning: A
Laboratory Manual, 2d Edition, 1989, and Short Protocols in Molecular Biology, ed.
Ausubel, et al., both of which are hereby incorporated by reference. Stringent conditions are
sequence-dependent and will be different in different circumstances. Longer sequences
hybridize specifically at higher temperatures. An extensive guide to the hybridization of
nucleic acids is found in Tijssen, Techniques in Biochemistry and Molecular
Biology--Hybridization with Nucleic Acid Probes, "Overview of principles of hybridization and the
strategy of nucleic acid assays" (1993). Generally, stringent conditions are selected to be
about 5-10°C lower than the thermal melting point (Tm) for the specific sequence at a
defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH and
nucleic acid concentration) at which 50% of the probes complementary to the target hybridize
to the target sequence at equilibrium (as the target sequences are present in excess, at Tm,
50% of the probes are occupied at equilibrium). Stringent conditions will be those in which
the salt concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M
sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about
30°C for short probes (e.g. 10 to 50 nucleotides) and at least about 60°C for long probes (e.g.
greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of
destabilizing agents such as formamide.
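
As a rough illustration of choosing a hybridization temperature about 5-10°C below Tm, the sketch below estimates Tm for a short oligonucleotide and derives the corresponding window. The Tm estimate uses the Wallace rule (2°C per A/T, 4°C per G/C), which is an assumption introduced here and is not given in the text; it is a coarse approximation intended only for short probes, and it ignores salt, pH and formamide effects. Function names are hypothetical.

```python
def wallace_tm(probe: str) -> float:
    """Approximate Tm in degrees C for a short oligo (Wallace rule; an assumption)."""
    p = probe.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2.0 * at + 4.0 * gc


def stringent_window(tm: float) -> tuple[float, float]:
    """Temperature range lying 10 to 5 degrees C below the melting point."""
    return (tm - 10.0, tm - 5.0)


if __name__ == "__main__":
    tm = wallace_tm("ACGTACGTACGTACGT")
    print(tm, stringent_window(tm))
```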

[60] In another embodiment, less stringent hybridization conditions are
used; for example, moderate or low stringency conditions may be used, as are known in the
art; see Maniatis and Ausubel, supra, and Tijssen, supra. For selective or specific
hybridization, a positive signal is at least two times background, preferably 10 times
background hybridization. Exemplary stringent hybridization conditions can be as follows:
50% formamide, 5x SSC, and 1% SDS, incubating at 42°C, or, 5x SSC, 1% SDS, incubating
at 65°C, with wash in 0.2x SSC, and 0.1% SDS at 65°C.

[61] Nucleic acids that do not hybridize to each other under stringent
conditions are still substantially identical if the polypeptides which they encode are
substantially identical. This occurs, for example, when a copy of a nucleic acid is created
using the maximum codon degeneracy permitted by the genetic code. In such cases, the
nucleic acids typically hybridize under moderately stringent hybridization conditions.
Exemplary "moderately stringent hybridization conditions" include a hybridization in a
buffer of 40% formamide, 1 M NaCl, 1% SDS at 37°C, and a wash in 1X SSC at 45°C. A
positive hybridization is at least twice background. Those of ordinary skill will readily
recognize that alternative hybridization and wash conditions can be utilized to provide
conditions of similar stringency. Additional guidelines for determining hybridization
parameters are provided in numerous references, e.g., Current Protocols in Molecular
Biology, ed. Ausubel, et al.
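
The effect of codon degeneracy noted above can be demonstrated with a short example: two coding sequences that differ at most nucleotide positions yet encode the same peptide. The sequences and function names are made up for illustration; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame 1, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def nucleotide_identity(a: str, b: str) -> float:
    """Percent of positions with the same base (ungapped, position by position)."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))


if __name__ == "__main__":
    dna_a, dna_b = "TTATCTCGT", "CTGAGCAGA"  # both encode Leu-Ser-Arg
    print(translate(dna_a), translate(dna_b), nucleotide_identity(dna_a, dna_b))
```

Here the encoded peptides are identical even though fewer than a quarter of the nucleotide positions agree, which is why such pairs may fail to hybridize under stringent conditions.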

[62] For PCR, a temperature of about 36°C is typical for low stringency
amplification, although annealing temperatures may vary between about 32°C and 48°C
depending on primer length. For high stringency PCR amplification, a temperature of about
62°C is typical, although high stringency annealing temperatures can range from about 50°C
to about 65°C, depending on the primer length and specificity. Typical cycle conditions for
both high and low stringency amplifications include a denaturation phase of 90°C - 95°C for
30 sec. - 2 min., an annealing phase lasting 30 sec. - 2 min., and an extension phase of about
72°C for 1 - 2 min. Protocols and guidelines for low and high stringency amplification
reactions are provided, e.g., in Innis et al., PCR Protocols, A Guide to Methods and
Applications (1990).

[63] In addition, the colorectal cancer nucleic acid sequences of the
invention are fragments of larger genes, i.e. they are nucleic acid segments. "Genes" in this
context includes coding regions, non-coding regions, and mixtures of coding and non-coding
regions. Accordingly, as will be appreciated by those in the art, using the sequences provided
herein, additional sequences of the colorectal cancer genes can be obtained, using techniques
well known in the art for cloning either longer sequences or the full length sequences; see
Maniatis et al., and Ausubel, et al., supra, hereby expressly incorporated by reference.

[64] An indication that two nucleic acid sequences or polypeptides are
substantially identical is that the polypeptide encoded by the first nucleic acid is
immunologically cross reactive with the antibodies raised against the polypeptide encoded by
the second nucleic acid. Thus, a polypeptide is typically substantially identical to a second
polypeptide, e.g., where the two peptides differ only by conservative substitutions. Another
indication that two nucleic acid sequences are substantially identical is that the two molecules
or their complements hybridize to each other under stringent conditions, as described above.
Yet another indication that two nucleic acid sequences are substantially identical is that the
same primers can be used to amplify the sequences.

[65] Once the colorectal cancer nucleic acid is identified, it can be cloned
and, if necessary, its constituent parts recombined to form the entire colorectal cancer nucleic
acid. Once isolated from its natural source, e.g., contained within a plasmid or other vector
or excised therefrom as a linear nucleic acid segment, the recombinant colorectal cancer
nucleic acid can be further used as a probe to identify and isolate other colorectal cancer
nucleic acids, for example additional coding regions. It can also be used as a "precursor"
nucleic acid to make modified or variant colorectal cancer nucleic acids and proteins.

[66] The colorectal cancer nucleic acids of the present invention are used in
several ways. In a first embodiment, nucleic acid probes to the colorectal cancer nucleic
acids are made and attached to biochips to be used in screening and diagnostic methods, as
outlined below, or for administration, for example for gene therapy and/or antisense
applications. Alternatively, the colorectal cancer nucleic acids that include coding regions of
colorectal cancer proteins can be put into expression vectors for the expression of colorectal
cancer proteins, again either for screening purposes or for administration to a patient.

[67] In a preferred embodiment, nucleic acid probes to colorectal cancer
nucleic acids (both the nucleic acid sequences encoding peptides outlined in Table 1 or
Table 2 and/or the complements thereof) are made. The nucleic acid probes attached to the
biochip are designed to be substantially complementary to the colorectal cancer nucleic
acids, i.e. the target sequence (either the target sequence of the sample or to other probe
sequences, for example in sandwich assays), such that hybridization of the target sequence
and the probes of the present invention occurs. As outlined below, this complementarity need
not be perfect; there may be any number of base pair mismatches which will interfere with
hybridization between the target sequence and the single stranded nucleic acids of the present
invention. However, if the number of mutations is so great that no hybridization can occur
under even the least stringent of hybridization conditions, the sequence is not a
complementary target sequence. Thus, by "substantially complementary" herein is meant
that the probes are sufficiently complementary to the target sequences to hybridize under
normal reaction conditions, particularly high stringency conditions, as outlined herein.
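
A simple computational proxy for probe-target complementarity is to count mismatches between a probe and the reverse complement of its target site. This sketch is illustrative only: the function names and the `max_mismatches` cut-off are assumptions standing in for actual hybridization behavior, which depends on the reaction conditions rather than on a fixed mismatch count.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def mismatches(probe: str, target: str) -> int:
    """Count mismatched positions when probe pairs antiparallel with target."""
    if len(probe) != len(target):
        raise ValueError("probe and target site must have equal length")
    perfect = reverse_complement(target)
    return sum(p != e for p, e in zip(probe.upper(), perfect))


def substantially_complementary(probe: str, target: str, max_mismatches: int = 2) -> bool:
    """Hypothetical cut-off: tolerate a small number of base pair mismatches."""
    return mismatches(probe, target) <= max_mismatches


if __name__ == "__main__":
    print(reverse_complement("AAGCTTGGC"), mismatches("GCCAAGCTA", "AAGCTTGGC"))
```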

[68] A nucleic acid probe is generally single stranded but can be partially
single and partially double stranded. The strandedness of the probe is dictated by the
structure, composition, and properties of the target sequence. In general, the nucleic acid
probes range from about 8 to about 100 bases long, with from about 10 to about 80 bases
being preferred, and from about 30 to about 50 bases being particularly preferred. That is,
generally whole genes are not used. In some embodiments, much longer nucleic acids can be
used, up to hundreds of bases.

[69] In a preferred embodiment, more than one probe per sequence is used,
with either overlapping probes or probes to different sections of the target being used. That
is, two, three, four or more probes, with three being preferred, are used to build in a
redundancy for a particular target. The probes can be overlapping (i.e. have some sequence
in common), or separate.
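
One way to lay out several probes across a target is to space their start positions evenly, which yields overlapping probes on short targets and separate probes on long ones. The sketch below is an assumption about layout, not the method of the disclosure; the function name and defaults are hypothetical, chosen to match the stated preference for three probes of about 30 to 50 bases.

```python
def tile_probes(target: str, probe_len: int = 40, n_probes: int = 3) -> list[tuple[int, str]]:
    """Return (start, target_segment) pairs spread evenly across the target.

    The segments are regions of the target; the physical probes would be
    synthesized as the complements of these segments.
    """
    if n_probes < 1:
        raise ValueError("at least one probe is required")
    if len(target) < probe_len:
        raise ValueError("target is shorter than the probe length")
    if n_probes == 1:
        starts = [0]
    else:
        span = len(target) - probe_len
        starts = [round(i * span / (n_probes - 1)) for i in range(n_probes)]
    return [(s, target[s:s + probe_len]) for s in starts]


if __name__ == "__main__":
    for start, segment in tile_probes("ACGT" * 25):
        print(start, len(segment))
```

With a 100-base target and 40-base probes, adjacent segments share 10 bases, giving the redundancy described above.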

[70] As will be appreciated by those in the art, nucleic acids can be
attached or immobilized to a solid support in a wide variety of ways. By "immobilized" and
grammatical equivalents herein is meant the association or binding between the nucleic acid
probe and the solid support is sufficient to be stable under the conditions of binding, washing,
analysis, and removal as outlined below. The binding can be covalent or non-covalent. By
"non-covalent binding" and grammatical equivalents herein is meant one or more of
electrostatic, hydrophilic, and hydrophobic interactions. Included in non-covalent binding is
the covalent attachment of a molecule, such as streptavidin, to the support and the
non-covalent binding of the biotinylated probe to the streptavidin. By "covalent binding" and
grammatical equivalents herein is meant that the two moieties, the solid support and the
probe, are attached by at least one bond, including sigma bonds, pi bonds and coordination
bonds. Covalent bonds can be formed directly between the probe and the solid support or can
be formed by a cross linker or by inclusion of a specific reactive group on either the solid
support or the probe or both molecules. Immobilization may also involve a combination of
covalent and non-covalent interactions.

[71] In general, the probes are attached to the biochip in a wide variety of
ways, as will be appreciated by those in the art. As described herein, the nucleic acids can
either be synthesized first, with subsequent attachment to the biochip, or can be directly
synthesized on the biochip.

[72] The biochip comprises a suitable solid substrate. By "substrate" or
"solid support" or other grammatical equivalents herein is meant any material that can be
modified to contain discrete individual sites appropriate for the attachment or association of
the nucleic acid probes and is amenable to at least one detection method. As will be
appreciated by those in the art, the number of possible substrates is very large, and includes,
but is not limited to, glass and modified or functionalized glass, plastics (including acrylics,
polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene,
polybutylene, polyurethanes, Teflon, etc.), polysaccharides, nylon or nitrocellulose, resins,
silica or silica-based materials including silicon and modified silicon, carbon, metals,
inorganic glasses, plastics, etc. In general, the substrates allow optical detection and do not
appreciably fluoresce. A preferred substrate is described in copending application entitled
Reusable Low Fluorescent Plastic Biochip, U.S. Application Serial No. 09/270,214, filed
March 15, 1999, herein incorporated by reference in its entirety.

[73] Generally the substrate is planar, although as will be appreciated by
those in the art, other configurations of substrates may be used as well. For example, the
probes may be placed on the inside surface of a tube, for flow-through sample analysis to
minimize sample volume. Similarly, the substrate may be flexible, such as a flexible foam,
including closed cell foams made of particular plastics.

[74] In a preferred embodiment, the surface of the biochip and the probe
may be derivatized with chemical functional groups for subsequent attachment of the two.
Thus, for example, the biochip is derivatized with a chemical functional group including, but
not limited to, amino groups, carboxy groups, oxo groups and thiol groups, with amino
groups being particularly preferred. Using these functional groups, the probes can be
attached using functional groups on the probes. For example, nucleic acids containing amino
groups can be attached to surfaces comprising amino groups, for example using linkers as are
known in the art; for example, homo- or hetero-bifunctional linkers as are well known (see
1994 Pierce Chemical Company catalog, technical section on cross-linkers, pages 155-200,
incorporated herein by reference). In addition, in some cases, additional linkers, such as
alkyl groups (including substituted and heteroalkyl groups) may be used.

[75] In this embodiment, the oligonucleotides are synthesized as is known
in the art, and then attached to the surface of the solid support. As will be appreciated by
those skilled in the art, either the 5' or 3' terminus may be attached to the solid support, or
attachment may be via an internal nucleoside.

[76] In an additional embodiment, the immobilization to the solid support
may be very strong, yet non-covalent. For example, biotinylated oligonucleotides can be
made, which bind to surfaces covalently coated with streptavidin, resulting in attachment.

[77] Alternatively, the oligonucleotides may be synthesized on the surface,
as is known in the art. For example, photoactivation techniques utilizing photopolymerization
compounds and techniques are used. In a preferred embodiment, the nucleic acids can be
synthesized in situ, using well known photolithographic techniques, such as those described
in WO 95/25116; WO 95/35505; U.S. Patent Nos. 5,700,637 and 5,445,934; and references
cited within, all of which are expressly incorporated by reference; these methods of
attachment form the basis of the Affymetrix GeneChip™ technology.

[78] In a preferred embodiment, colorectal cancer nucleic acids encoding
colorectal cancer proteins are used to make a variety of expression vectors to express
colorectal cancer proteins which can then be used in screening assays, as described below.
The expression vectors may be either self-replicating extrachromosomal vectors or vectors
which integrate into a host genome. Generally, these expression vectors include
transcriptional and translational regulatory nucleic acid operably linked to the nucleic acid
encoding the colorectal cancer protein. The term "control sequences" refers to DNA
sequences necessary for the expression of an operably linked coding sequence in a particular
host organism. The control sequences that are suitable for prokaryotes, for example, include
a promoter, optionally an operator sequence, and a ribosome binding site. Eukaryotic cells
are known to utilize promoters, polyadenylation signals, and enhancers.

[79] Nucleic acid is "operably linked" when it is placed into a functional
relationship with another nucleic acid sequence. For example, DNA for a presequence or
secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein
that participates in the secretion of the polypeptide; a promoter or enhancer is operably linked
to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site
is operably linked to a coding sequence if it is positioned so as to facilitate translation.
Generally, "operably linked" means that the DNA sequences being linked are contiguous,
and, in the case of a secretory leader, contiguous and in reading phase. However, enhancers
do not have to be contiguous. Linking is accomplished by ligation at convenient restriction
sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are used in
accordance with conventional practice. The transcriptional and translational regulatory
nucleic acid will generally be appropriate to the host cell used to express the colorectal cancer
protein; for example, transcriptional and translational regulatory nucleic acid sequences from
Bacillus are preferably used to express the colorectal cancer protein in Bacillus. Numerous
types of appropriate expression vectors, and suitable regulatory sequences, are known in the
art for a variety of host cells.

[80] In general, the transcriptional and translational regulatory sequences
may include, but are not limited to, promoter sequences, ribosomal binding sites,
transcriptional start and stop sequences, translational start and stop sequences, and enhancer
or activator sequences. In a preferred embodiment, the regulatory sequences include a
promoter and transcriptional start and stop sequences.

[81] Promoter sequences encode either constitutive or inducible promoters.
The promoters may be either naturally occurring promoters or hybrid promoters. Hybrid
promoters, which combine elements of more than one promoter, are also known in the art,
and are useful in the present invention.

[82] In addition, the expression vector may comprise additional elements.
For example, the expression vector may have two replication systems, thus allowing it to be
maintained in two organisms, for example in mammalian or insect cells for expression and in
a procaryotic host for cloning and amplification. Furthermore, for integrating expression
vectors, the expression vector contains at least one sequence homologous to the host cell
genome, and preferably two homologous sequences which flank the expression construct.
The integrating vector may be directed to a specific locus in the host cell by selecting the
appropriate homologous sequence for inclusion in the vector. Constructs for integrating
vectors are well known in the art.

[83] In addition, in a preferred embodiment, the expression vector contains
a selectable marker gene to allow the selection of transformed host cells. Selection genes are
well known in the art and will vary with the host cell used.

[84] The colorectal cancer proteins of the present invention are produced
by culturing a host cell transformed with an expression vector containing nucleic acid
encoding a colorectal cancer protein, under the appropriate conditions to induce or cause
expression of the colorectal cancer protein. The conditions appropriate for colorectal cancer
protein expression will vary with the choice of the expression vector and the host cell, and
will be easily ascertained by one skilled in the art through routine experimentation. For
example, the use of constitutive promoters in the expression vector will require optimizing
the growth and proliferation of the host cell, while the use of an inducible promoter requires
the appropriate growth conditions for induction. In addition, in some embodiments, the
timing of the harvest is important. For example, the baculoviral systems used in insect cell
expression are lytic viruses, and thus harvest time selection can be crucial for product yield.

[85] Appropriate host cells include yeast, bacteria, archaebacteria, fungi,
and insect and animal cells, including mammalian cells. Of particular interest are Drosophila
melanogaster cells, Saccharomyces cerevisiae and other yeasts, E. coli, Bacillus subtilis, Sf9
cells, C129 cells, 293 cells, Neurospora, BHK, CHO, COS, HeLa cells, THP1 cell line (a
macrophage cell line) and human cells and cell lines.

[86] In a preferred embodiment, the colorectal cancer proteins are
expressed in mammalian cells. Mammalian expression systems are also known in the art, and
include retroviral systems. A preferred expression vector system is a retroviral vector system
such as is generally described in PCT/US97/01019 and PCT/US97/01048, both of which are
hereby expressly incorporated by reference. Of particular use as mammalian promoters are
the promoters from mammalian viral genes, since the viral genes are often highly expressed
and have a broad host range. Examples include the SV40 early promoter, mouse mammary
tumor virus LTR promoter, adenovirus major late promoter, herpes simplex virus promoter,
and the CMV promoter. Typically, transcription termination and polyadenylation sequences
recognized by mammalian cells are regulatory regions located 3' to the translation stop codon
and thus, together with the promoter elements, flank the coding sequence. Examples of
transcription terminator and polyadenylation signals include those derived from SV40.

[87] The methods of introducing exogenous nucleic acid into mammalian
hosts, as well as other hosts, are well known in the art, and will vary with the host cell used.
Techniques include dextran-mediated transfection, calcium phosphate precipitation,
polybrene-mediated transfection, protoplast fusion, electroporation, viral infection,
encapsulation of the polynucleotide(s) in liposomes, and direct microinjection of the DNA
into nuclei.

[88] In a preferred embodiment, colorectal cancer proteins are expressed in
bacterial systems. Bacterial expression systems are well known in the art. Promoters from
bacteriophage may also be used and are known in the art. In addition, synthetic promoters
and hybrid promoters are also useful; for example, the tac promoter is a hybrid of the trp and
lac promoter sequences. Furthermore, a bacterial promoter can include naturally occurring
promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and
initiate transcription. In addition to a functioning promoter sequence, an efficient ribosome
binding site is desirable. The expression vector may also include a signal peptide sequence
that provides for secretion of the colorectal cancer protein in bacteria. The protein is either
secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located
between the inner and outer membrane of the cell (gram-negative bacteria). The bacterial
expression vector may also include a selectable marker gene to allow for the selection of
bacterial strains that have been transformed. Suitable selection genes include genes which
render the bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin,
kanamycin, neomycin and tetracycline. Selectable markers also include biosynthetic genes,
such as those in the histidine, tryptophan and leucine biosynthetic pathways. These
components are assembled into expression vectors. Expression vectors for bacteria are well
known in the art, and include vectors for Bacillus subtilis, E. coli, Streptococcus cremoris,
and Streptococcus lividans, among others. The bacterial expression vectors are transformed
into bacterial host cells using techniques well known in the art, such as calcium chloride
treatment, electroporation, and others.
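As an illustration only, the components listed for a bacterial expression vector can be written down as a small record. The sketch below is hypothetical bookkeeping, not part of the described method: the class name, field names and example values are assumptions introduced here, and the check merely reports which of the elements named in the text are absent from a design.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BacterialExpressionVector:
    """Hypothetical record of the vector elements named in the text."""
    promoter: str                          # e.g. the tac hybrid promoter
    ribosome_binding_site: str             # described as desirable
    coding_sequence: str                   # nucleic acid encoding the protein
    signal_peptide: Optional[str] = None   # optional, provides for secretion
    selectable_markers: List[str] = field(default_factory=list)

    def absent_elements(self) -> List[str]:
        """Names of the promoter / RBS / coding sequence fields left empty."""
        checked = {
            "promoter": self.promoter,
            "ribosome_binding_site": self.ribosome_binding_site,
            "coding_sequence": self.coding_sequence,
        }
        return [name for name, value in checked.items() if not value]

    def secretion_target(self, gram_positive: bool) -> str:
        """Where the text says a signal peptide directs the protein."""
        if self.signal_peptide is None:
            return "not secreted"
        return "growth media" if gram_positive else "periplasmic space"


vector = BacterialExpressionVector(
    promoter="tac",
    ribosome_binding_site="",      # left empty on purpose for the example
    coding_sequence="ATGGCC",      # placeholder sequence, not from the source
    signal_peptide="placeholder",
    selectable_markers=["ampicillin resistance"],
)
print(vector.absent_elements())
print(vector.secretion_target(gram_positive=False))
```

A record of this kind only tracks which elements a construct declares; whether a given promoter or ribosome binding site functions in a particular host remains an experimental question.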

[89] In one embodiment, colorectal cancer proteins are produced in insect
cells. Expression vectors for the transformation of insect cells, and in particular,
baculovirus-based expression vectors, are well known in the art.

[90] In a preferred embodiment, colorectal cancer protein is produced in
yeast cells. Yeast expression systems are well known in the art, and include expression
vectors for Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula
polymorpha, Kluyveromyces fragilis and K. lactis, Pichia guilliermondii and P. pastoris,
Schizosaccharomyces pombe, and Yarrowia lipolytica.

[91] The colorectal cancer protein may also be made as a fusion protein,
using techniques well known in the art. Thus, for example, for the creation of monoclonal
antibodies, if the desired epitope is small, the colorectal cancer protein may be fused to a
carrier protein to form an immunogen. Alternatively, the colorectal cancer protein may be
made as a fusion protein to increase expression, or for other reasons. For example, when the
colorectal cancer protein is a colorectal cancer peptide, the nucleic acid encoding the peptide
may be linked to other nucleic acid for expression purposes.

[92] In one embodiment, the colorectal cancer nucleic acids, proteins and
antibodies of the invention are labeled. By "labeled" herein is meant that a compound has at
least one element, isotope or chemical compound attached to enable the detection of the
compound. In general, labels fall into three classes: a) isotopic labels, which may be
radioactive or heavy isotopes; b) immune labels, which may be antibodies or antigens; and c)
colored or fluorescent dyes. The labels may be incorporated into the colorectal cancer
nucleic acids, proteins and antibodies at any position. For example, the label should be
capable of producing, either directly or indirectly, a detectable signal. The detectable moiety
may be a radioisotope, such as 3H, 14C, 32P, 35S, or 125I, a fluorescent or
chemiluminescent compound, such as fluorescein isothiocyanate, rhodamine, or luciferin, or
an enzyme, such as alkaline phosphatase, beta-galactosidase or horseradish peroxidase. Any
method known in the art for conjugating the antibody to the label may be employed,
including those methods described by Hunter et al., Nature, 144:945 (1962); David et al.,
Biochemistry, 13:1014 (1974); Pain et al., J. Immunol. Meth., 40:219 (1981); and Nygren, J.
Histochem. and Cytochem., 30:407 (1982).

[93] Accordingly, the present invention also provides colorectal cancer
protein sequences. A colorectal cancer protein of the present invention may be identified in
several ways. "Protein" in this sense includes proteins, polypeptides, and peptides, terms
which are used interchangeably herein to refer to a polymer of amino acid residues. The
terms apply to amino acid polymers in which one or more amino acid residues is an artificial
chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally
occurring amino acid polymers, those containing modified residues, and non-naturally
occurring amino acid polymers.

[94] As will be appreciated by those in the art, the nucleic acid sequences
of the invention can be used to generate protein sequences. There are a variety of ways to do
this, including cloning the entire gene and verifying its frame and amino acid sequence, or by
comparing it to known sequences to search for homology to provide a frame, assuming the
colorectal cancer protein has homology to some protein in the database being used.
Generally, the nucleic acid sequences are input into a program that will search all three
frames for homology. This is done in a preferred embodiment using the following NCBI
Advanced BLAST parameters. The program is blastx or blastn. The database is nr. The
input data is as "Sequence in FASTA format". The organism list is "none". The "expect" is
10; the filter is default. The "descriptions" is 500, the "alignments" is 500, and the
"alignment view" is pairwise. The "Query Genetic Codes" is standard (1). The matrix is
BLOSUM62; gap existence cost is 11, per residue gap cost is 1; and the lambda ratio is 0.85
default. This results in the generation of a putative protein sequence.
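To make the idea of reading a nucleic acid sequence in each of three frames concrete, the following minimal sketch translates a DNA string at offsets 0, 1 and 2 using the standard genetic code (the "standard (1)" code named above). It is an illustration under stated assumptions and not the BLAST program itself: it performs no homology search, the function names are introduced here, and the example sequence is arbitrary.

```python
# Standard genetic code (translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate codon by codon; '*' marks a stop, 'X' an unrecognized codon."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def forward_frames(dna: str) -> list:
    """Return the translations of the three forward reading frames."""
    return [translate(dna[offset:]) for offset in range(3)]


for number, protein in enumerate(forward_frames("ATGGCCTAA"), start=1):
    print(f"frame {number}: {protein}")
```

Each of the three translations would then be compared against a protein database; the frame that yields a homologous match suggests the reading frame of the putative protein.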

[95] Also included within one embodiment of colorectal cancer proteins
are amino acid variants of the naturally occurring sequences, as determined herein.
Preferably, the variants are greater than about 75% homologous to the wild-type
sequence, more preferably greater than about 80%, even more preferably greater than about
85% and most preferably greater than 90%. In some embodiments the homology will be as
high as about 93 to 95 or 98%. As for nucleic acids, homology in this context means
sequence similarity or identity, with identity being preferred. This homology will be
determined using standard techniques known in the art as are outlined above for the nucleic
acid homologies.
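The percentage thresholds above can be illustrated with a short calculation. The sketch below assumes the two sequences have already been aligned (gaps written as "-") and counts identical columns over all aligned columns; this denominator is one common convention among several and is an assumption made here, as are the function names and the example sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


# Thresholds as stated in the text, highest first.
THRESHOLDS = [
    (90.0, "most preferred"),
    (85.0, "even more preferred"),
    (80.0, "more preferred"),
    (75.0, "preferred"),
]


def homology_tier(identity: float) -> str:
    """Map a percent identity onto the preference tiers given in the text."""
    for cutoff, label in THRESHOLDS:
        if identity > cutoff:
            return label
    return "below stated thresholds"


identity = percent_identity("MKTAYIAK", "MKTAYIAR")
print(identity, homology_tier(identity))
```

Because the text says "greater than about" each value, the strict cutoffs used here are a simplification.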

[96] Colorectal cancer proteins of the present invention may be shorter or
longer than the wild type amino acid sequences. Thus, in a preferred embodiment, included
within the definition of colorectal cancer proteins are portions or fragments of the wild type
sequences herein. In addition, as outlined above, the colorectal cancer nucleic acids of the
invention may be used to obtain additional coding regions, and thus additional protein
sequence, using techniques known in the art.

[97] In a preferred embodiment, the colorectal cancer proteins are
derivative or variant colorectal cancer proteins as compared to the wild-type sequence. That
is, as outlined more fully below, the derivative colorectal cancer peptide will contain at least
one amino acid substitution, deletion or insertion, with amino acid substitutions being
particularly preferred. The amino acid substitution, insertion or deletion may occur at any
residue within the colorectal cancer peptide.

[98] Also included in an embodiment of colorectal cancer proteins of the
present invention are amino acid sequence variants. These variants fall into one or more of
three classes: substitutional, insertional or deletional variants. These variants ordinarily are
prepared by site specific mutagenesis of nucleotides in the DNA encoding the colorectal
cancer protein, using cassette or PCR mutagenesis or other techniques well known in the art,
to produce DNA encoding the variant, and thereafter expressing the DNA in recombinant cell
culture as outlined above. However, variant colorectal cancer protein fragments having up to
about 100-150 residues may be prepared by in vitro synthesis using established techniques.
Amino acid sequence variants are characterized by the predetermined nature of the variation,
a feature that sets them apart from naturally occurring allelic or interspecies variation of the
colorectal cancer protein amino acid sequence. The variants typically exhibit the same
qualitative biological activity as the naturally occurring analogue, although variants can also
be selected which have modified characteristics as will be more fully outlined below.

[99] While the site or region for introducing an amino acid sequence
variation is predetermined, the mutation per se need not be predetermined. For example, in
order to optimize the performance of a mutation at a given site, random mutagenesis may be
conducted at the target codon or region and the expressed colorectal cancer variants screened
for the optimal combination of desired activity. Techniques for making substitution
mutations at predetermined sites in DNA having a known sequence are well known, for
example, M13 primer mutagenesis and PCR mutagenesis. Screening of the mutants is done
using assays of colorectal cancer protein activities.
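The distinction drawn above, a predetermined site with an undetermined mutation, can be sketched in silico by enumerating every alternative codon at one chosen codon position. This is only an illustration of the search space that random mutagenesis at a target codon would sample; the function name and the example sequence are assumptions introduced here, and no laboratory protocol is implied.

```python
from itertools import product


def codon_variants(coding_dna: str, codon_index: int) -> list:
    """Return all sequences differing from the input only at one codon.

    codon_index is 0-based and counts codons from the start of coding_dna.
    """
    start = 3 * codon_index
    if start < 0 or start + 3 > len(coding_dna):
        raise ValueError("codon index out of range")
    original = coding_dna[start:start + 3].upper()
    variants = []
    for bases in product("ACGT", repeat=3):
        codon = "".join(bases)
        if codon != original:
            variants.append(coding_dna[:start] + codon + coding_dna[start + 3:])
    return variants


variants = codon_variants("ATGGCCAAA", 1)  # vary the second codon (GCC)
print(len(variants), variants[:3])
```

The 63 resulting sequences encode at most 19 alternative residues plus stop codons, which is why the expressed variants are screened for activity afterwards.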

[100] Amino acid substitutions are typically of single residues; insertions
usually will be on the order of from about 1 to 20 amino acids, although considerably larger
insertions may be tolerated. Deletions range from about 1 to about 20 residues, although in
some cases deletions may be much larger.
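The three classes of variant (substitutional, insertional and deletional) can be shown on a one-letter protein string. The helper names, the 1-based numbering convention and the example sequence below are assumptions chosen for illustration.

```python
def substitute(seq: str, position: int, new_residue: str) -> str:
    """Replace the residue at a 1-based position with a single new residue."""
    i = position - 1
    if not 0 <= i < len(seq):
        raise ValueError("position out of range")
    return seq[:i] + new_residue + seq[i + 1:]


def insert_after(seq: str, position: int, residues: str) -> str:
    """Insert residues after a 1-based position (0 inserts before residue 1)."""
    if not 0 <= position <= len(seq):
        raise ValueError("position out of range")
    return seq[:position] + residues + seq[position:]


def delete(seq: str, position: int, length: int = 1) -> str:
    """Delete `length` residues starting at a 1-based position."""
    i = position - 1
    if length < 1 or i < 0 or i + length > len(seq):
        raise ValueError("deletion out of range")
    return seq[:i] + seq[i + length:]


wild_type = "MKTAYIAK"  # arbitrary example sequence
print(substitute(wild_type, 2, "R"))     # substitutional variant
print(insert_after(wild_type, 3, "GG"))  # insertional variant
print(delete(wild_type, 4, 2))           # deletional variant
```

In practice the change is made in the encoding DNA rather than in the protein string; the string operations only show what each class of variant looks like at the sequence level.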

[101] Substitutions, deletions, insertions or any combination thereof may be
used to arrive at a final derivative. Generally these changes are done on a few amino acids to
minimize the alteration of the molecule. However, larger changes may be tolerated in certain
circumstances. When small alterations in the characteristics of the colorectal cancer protein
are desired, substitutions are generally made in accordance with the following chart:

Chart I

Original Residue    Exemplary Substitutions
Ala                 Ser
Arg                 Lys
Asn                 Gln, His
Asp                 Glu
Cys                 Ser
Gln                 Asn
Glu                 Asp
Gly                 Pro
His                 Asn, Gln
Ile                 Leu, Val
Leu                 Ile, Val
Lys                 Arg, Gln, Glu
Met                 Leu, Ile
Phe                 Met, Leu, Tyr
Ser                 Thr
Thr                 Ser
Trp                 Tyr
Tyr                 Trp, Phe
Val                 Ile, Leu


[102] Substantial changes in function or immunological identity are made by
selecting substitutions that are less conservative than those shown in Chart I. For example,
substitutions may be made which more significantly affect: the structure of the polypeptide
backbone in the area of the alteration, for example the alpha-helical or beta-sheet structure;
the charge or hydrophobicity of the molecule at the target site; or the bulk of the side chain.
The substitutions which in general are expected to produce the greatest changes in the
polypeptide's properties are those in which (a) a hydrophilic residue, e.g. seryl or threonyl, is
substituted for (or by) a hydrophobic residue, e.g. leucyl, isoleucyl, phenylalanyl, valyl or
alanyl; (b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue
having an electropositive side chain, e.g. lysyl, arginyl, or histidyl, is substituted for (or by)
an electronegative residue, e.g. glutamyl or aspartyl; or (d) a residue having a bulky side
chain, e.g. phenylalanine, is substituted for (or by) one not having a side chain, e.g. glycine.
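A lookup table is one simple way to check whether a proposed substitution is among the exemplary substitutions of Chart I or is a less conservative change of the kind just described. In the sketch below the dictionary is a transcription of Chart I as read here and should be checked against the original chart; the function name is an assumption introduced for illustration.

```python
# Chart I as a mapping: original residue -> exemplary substitutions.
CHART_I = {
    "Ala": ["Ser"],
    "Arg": ["Lys"],
    "Asn": ["Gln", "His"],
    "Asp": ["Glu"],
    "Cys": ["Ser"],
    "Gln": ["Asn"],
    "Glu": ["Asp"],
    "Gly": ["Pro"],
    "His": ["Asn", "Gln"],
    "Ile": ["Leu", "Val"],
    "Leu": ["Ile", "Val"],
    "Lys": ["Arg", "Gln", "Glu"],
    "Met": ["Leu", "Ile"],
    "Phe": ["Met", "Leu", "Tyr"],
    "Ser": ["Thr"],
    "Thr": ["Ser"],
    "Trp": ["Tyr"],
    "Tyr": ["Trp", "Phe"],
    "Val": ["Ile", "Leu"],
}


def is_chart_i_substitution(original: str, replacement: str) -> bool:
    """True if the replacement is listed in Chart I for the original residue."""
    return replacement in CHART_I.get(original, [])


print(is_chart_i_substitution("Lys", "Arg"))  # listed in Chart I
print(is_chart_i_substitution("Ser", "Leu"))  # hydrophilic to hydrophobic, not listed
```

A substitution that is not listed is not necessarily disruptive; the table only identifies the exemplary small-alteration cases.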

[103] The variants typically exhibit the same qualitative biological activity
and will elicit the same immune response as the naturally-occurring analogue, although
variants also are selected to modify the characteristics of the colorectal cancer proteins as
needed. Alternatively, the variant may be designed such that the biological activity of the
colorectal cancer protein is altered. For example, glycosylation sites may be altered or
removed.

[104] Covalent modifications of colorectal cancer polypeptides are included
within the scope of this invention. One type of covalent modification includes reacting
targeted amino acid residues of a colorectal cancer polypeptide with an organic derivatizing
agent that is capable of reacting with selected side chains or the N- or C-terminal residues of a
colorectal cancer polypeptide. Derivatization with bifunctional agents is useful, for instance,
for crosslinking colorectal cancer to a water-insoluble support matrix or surface for use in
the method for purifying anti-colorectal cancer antibodies or screening assays, as is more
fully described below. Commonly used crosslinking agents include, e.g.,
1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxy-succinimide esters, for example, esters
with 4-azido-salicylic acid, homobifunctional imidoesters, including disuccinimidyl esters
such as 3,3'-dithiobis-(succinimidyl-propionate), bifunctional maleimides such as
bis-N-maleimido-1,8-octane and agents such as methyl-3-[(p-azidophenyl)-dithio]propioimidate.

[105] Other modifications include deamidation of glutaminyl and
asparaginyl residues to the corresponding glutamyl and aspartyl residues, respectively,
hydroxylation of proline and lysine, phosphorylation of hydroxyl groups of seryl, threonyl or
tyrosyl residues, methylation of the α-amino groups of lysine, arginine, and histidine side
chains [T.E. Creighton, Proteins: Structure and Molecular Properties, W.H. Freeman & Co.,
San Francisco, pp. 79-86 (1983)], acetylation of the N-terminal amine, and amidation of any
C-terminal carboxyl group.

[106] Another type of covalent modification of the colorectal cancer
polypeptide included within the scope of this invention comprises altering the native
glycosylation pattern of the polypeptide. "Altering the native glycosylation pattern" is
intended for purposes herein to mean deleting one or more carbohydrate moieties found in
native sequence colorectal cancer polypeptide, and/or adding one or more glycosylation sites
that are not present in the native sequence colorectal cancer polypeptide.

[107] Addition of glycosylation sites to colorectal cancer polypeptides may
be accomplished by altering the amino acid sequence thereof. The alteration may be made,
for example, by the addition of, or substitution by, one or more serine or threonine residues to
the native sequence colorectal cancer polypeptide (for O-linked glycosylation sites). The
colorectal cancer amino acid sequence may optionally be altered through changes at the
DNA level, particularly by mutating the DNA encoding the colorectal cancer polypeptide at
preselected bases such that codons are generated that will translate into the desired amino
acids.

[108] Another means of increasing the number of carbohydrate moieties on
the colorectal cancer polypeptide is by chemical or enzymatic coupling of glycosides to the
polypeptide. Such methods are described in the art, e.g., in WO 87/05330 published 11
September 1987, and in Aplin and Wriston, CRC Crit. Rev. Biochem., pp. 259-306 (1981).

[109] Removal of carbohydrate moieties present on the colorectal cancer
polypeptide may be accomplished chemically or enzymatically or by mutational substitution
of codons encoding for amino acid residues that serve as targets for glycosylation. Chemical
deglycosylation techniques are known in the art and described, for instance, by Hakimuddin,
et al., Arch. Biochem. Biophys., 259:52 (1987) and by Edge et al., Anal. Biochem., 118:131
(1981). Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the
use of a variety of endo- and exo-glycosidases as described by Thotakura et al., Meth.
Enzymol., 138:350 (1987).

[110] Another type of covalent modification of colorectal cancer comprises
linking the colorectal cancer polypeptide to one of a variety of nonproteinaceous polymers,
e.g., polyethylene glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth
in U.S. Patent Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192 or 4,179,337.

[111] Colorectal cancer polypeptides of the present invention may also be
modified in a way to form chimeric molecules comprising a colorectal cancer polypeptide
fused to another, heterologous polypeptide or amino acid sequence. In one embodiment, such
a chimeric molecule comprises a fusion of a colorectal cancer polypeptide with a tag
polypeptide which provides an epitope to which an anti-tag antibody can selectively bind.
The epitope tag is generally placed at the amino- or carboxyl-terminus of the colorectal cancer
polypeptide. The presence of such epitope-tagged forms of a colorectal cancer polypeptide
can be detected using an antibody against the tag polypeptide. Also, provision of the epitope
tag enables the colorectal cancer polypeptide to be readily purified by affinity purification
using an anti-tag antibody or another type of affinity matrix that binds to the epitope tag. In
an alternative embodiment, the chimeric molecule may comprise a fusion of a colorectal
cancer polypeptide with an immunoglobulin or a particular region of an immunoglobulin.
For a bivalent form of the chimeric molecule, such a fusion could be to the Fc region of an
IgG molecule.

[112] Various tag polypeptides and their respective antibodies are well
known in the art. Examples include poly-histidine (poly-his) or poly-histidine-glycine
(poly-his-gly) tags; the flu HA tag polypeptide and its antibody 12CA5 [Field et al., Mol. Cell.
Biol., 8:2159-2165 (1988)]; the c-myc tag and the 8F9, 3C7, 6E10, G4, B7 and 9E10
antibodies thereto [Evan et al., Molecular and Cellular Biology, 5:3610-3616 (1985)]; and the
Herpes Simplex virus glycoprotein D (gD) tag and its antibody [Paborsky et al., Protein
Engineering, 3(6):547-553 (1990)]. Other tag polypeptides include the Flag-peptide [Hopp et
al., BioTechnology, 6:1204-1210 (1988)]; the KT3 epitope peptide [Martin et al., Science,
255:192-194 (1992)]; tubulin epitope peptide [Skinner et al., J. Biol. Chem., 266:15163-15166
(1991)]; and the T7 gene 10 protein peptide tag [Lutz-Freyermuth et al., Proc. Natl.
Acad. Sci. USA, 87:6393-6397 (1990)].
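Placing an epitope tag at the amino- or carboxyl-terminus amounts, at the sequence level, to joining two amino acid strings. The sketch below illustrates this with tag sequences as they are commonly cited in the literature; these one-letter sequences are not given in this document and are assumptions here, as are the function name and the example protein.

```python
# Commonly cited one-letter tag sequences (assumed; not taken from this text).
TAGS = {
    "poly-his": "HHHHHH",
    "HA": "YPYDVPDYA",
    "c-myc": "EQKLISEEDL",
    "Flag": "DYKDDDDK",
}


def add_epitope_tag(protein: str, tag_name: str, terminus: str = "C") -> str:
    """Return the protein sequence with the named tag fused at one terminus."""
    tag = TAGS[tag_name]
    if terminus == "N":
        return tag + protein
    if terminus == "C":
        return protein + tag
    raise ValueError("terminus must be 'N' or 'C'")


print(add_epitope_tag("MKTAYIAK", "poly-his", "C"))
print(add_epitope_tag("MKTAYIAK", "Flag", "N"))
```

A real construct is built at the DNA level and typically keeps an initiator methionine ahead of an amino-terminal tag and may include a linker; the plain concatenation above ignores both.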

[113] Also included within the definition of colorectal cancer protein in one
embodiment are other colorectal cancer proteins of the colorectal cancer family, and
colorectal cancer proteins from other organisms, which are cloned and expressed as outlined
below. Thus, probe or degenerate polymerase chain reaction (PCR) primer sequences may be
used to find other related colorectal cancer proteins from humans or other organisms. As
will be appreciated by those in the art, particularly useful probe and/or PCR primer sequences
include the unique areas of the colorectal cancer nucleic acid sequence. As is generally
known in the art, preferred PCR primers are from about 15 to about 35 nucleotides in length,
with from about 20 to about 30 being preferred, and may contain inosine as needed. The
conditions for the PCR reaction are well known in the art.
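The primer length ranges stated above translate directly into a simple check. The sketch below accepts the four standard bases plus I for inosine, and treats the "about" bounds as exact cutoffs, which is a simplification; the function name and labels are assumptions introduced here, and degenerate IUPAC codes are not handled.

```python
def classify_primer(primer: str) -> str:
    """Classify a primer by length against the ranges stated in the text."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGTI"):
        raise ValueError("primer must contain only A, C, G, T or I (inosine)")
    length = len(seq)
    if 20 <= length <= 30:
        return "preferred"
    if 15 <= length <= 35:
        return "acceptable"
    return "outside stated range"


print(classify_primer("ACGT" * 6))   # 24 nucleotides
print(classify_primer("ACGTI" * 3))  # 15 nucleotides, contains inosine
print(classify_primer("ACGT" * 10))  # 40 nucleotides
```

Length is only one criterion; the text also points to the unique areas of the nucleic acid sequence as the most useful primer sites.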

[114] In addition, as is outlined herein, colorectal cancer proteins can be
made that are longer than those depicted in Table 1 or Table 2, for example, by the
elucidation of additional sequences, the addition of epitope or purification tags, the addition
of other fusion sequences, etc.

[115] Colorectal cancer proteins may also be identified as being encoded by
colorectal cancer nucleic acids. Thus, colorectal cancer proteins are encoded by nucleic
acids that will hybridize to the sequences of the sequence listings, or their complements, as
outlined herein.

[116] In a preferred embodiment, when the colorectal cancer protein is to be
used to generate antibodies, for example for immunotherapy, the colorectal cancer protein
should share at least one epitope or determinant with the full length protein. By "epitope" or
"determinant" herein is meant a portion of a protein which will generate and/or bind an
antibody or T-cell receptor in the context of MHC. Thus, in most instances, antibodies made
to a smaller colorectal cancer protein will be able to bind to the full length protein. In a
preferred embodiment, the epitope is unique; that is, antibodies generated to a unique epitope
show little or no cross-reactivity. In a preferred embodiment, the epitope is selected from a
peptide encoded by a nucleic acid of Table 1. In another preferred embodiment, the epitope is
selected from the CBF9 peptide sequence shown in Table 2.

[117] In one embodiment, the term "antibody" includes antibody fragments, 
as are known in the art, including Fab, Fab2, single chain antibodies (Fv for example), 
chimeric antibodies, etc., either produced by the modification of whole antibodies or those 
synthesized de novo using recombinant DNA technologies. 

[118] Methods of preparing polyclonal antibodies are known to the skilled
artisan. Polyclonal antibodies can be raised in a mammal, for example, by one or more
injections of an immunizing agent and, if desired, an adjuvant. Typically, the immunizing
agent and/or adjuvant will be injected in the mammal by multiple subcutaneous or
intraperitoneal injections. The immunizing agent may include the CBF9 peptide of Table 2,
or a peptide encoded by a nucleic acid of Table 1 or fragment thereof or a fusion protein
thereof. It may be useful to conjugate the immunizing agent to a protein known to be
immunogenic in the mammal being immunized. Examples of such immunogenic proteins
include but are not limited to keyhole limpet hemocyanin, serum albumin, bovine
thyroglobulin, and soybean trypsin inhibitor. Examples of adjuvants which may be employed
include Freund's complete adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate). The immunization protocol may be selected by one
skilled in the art without undue experimentation.

[119] The antibodies may, alternatively, be monoclonal antibodies.
Monoclonal antibodies may be prepared using hybridoma methods, such as those described
by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse, hamster,
or other appropriate host animal, is typically immunized with an immunizing agent to elicit
lymphocytes that produce or are capable of producing antibodies that will specifically bind to
the immunizing agent. Alternatively, the lymphocytes may be immunized in vitro. The
immunizing agent will typically include the CBF9 polypeptide or a peptide encoded by a
nucleic acid of Table 1 or a fragment thereof or a fusion protein thereof. Generally, either
peripheral blood lymphocytes ("PBLs") are used if cells of human origin are desired, or
spleen cells or lymph node cells are used if non-human mammalian sources are desired. The
lymphocytes are then fused with an immortalized cell line using a suitable fusing agent, such
as polyethylene glycol, to form a hybridoma cell [Goding, Monoclonal Antibodies: Principles
and Practice, Academic Press, (1986) pp. 59-103]. Immortalized cell lines are usually
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human
origin. Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells may be
cultured in a suitable culture medium that preferably contains one or more substances that
inhibit the growth or survival of the unfused, immortalized cells. For example, if the parental
cells lack the enzyme hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT),
the culture medium for the hybridomas typically will include hypoxanthine, aminopterin, and
thymidine ("HAT medium"), which substances prevent the growth of HGPRT-deficient cells.

[120] In one embodiment, the antibodies are bispecific antibodies.
Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that have
binding specificities for at least two different antigens. In the present case, one of the binding
specificities is for a colorectal cancer protein or a fragment thereof, the other one is for any
other antigen, and preferably for a cell-surface protein or receptor or receptor subunit,
preferably one that is tumor specific.

[121] In a preferred embodiment, the antibodies to colorectal cancer are
capable of reducing or eliminating the biological function of colorectal cancer, as is
described below. That is, the addition of anti-colorectal cancer antibodies (either polyclonal
or preferably monoclonal) to colorectal cancer (or cells containing colorectal cancer) may
reduce or eliminate the colorectal cancer activity. Generally, at least a 25% decrease in
activity is preferred, with at least about 50% being particularly preferred and about a
95-100% decrease being especially preferred.

[122] In a preferred embodiment the antibodies to the colorectal cancer 
proteins are humanized antibodies. Humanized forms of non-human (e.g., murine) antibodies 
are chimeric molecules of immunoglobulins, immunoglobulin chains or fragments thereof 
(such as Fv, Fab, Fab', F(ab')2 or other antigen-binding subsequences of antibodies) which 
contain minimal sequence derived from non-human immunoglobulin. Humanized antibodies 
include human immunoglobulins (recipient antibody) in which residues from a 
complementarity determining region (CDR) of the recipient are replaced by residues from a 
CDR of a non-human species (donor antibody) such as mouse, rat or rabbit having the desired 
specificity, affinity and capacity. In some instances, Fv framework residues of the human 
immunoglobulin are replaced by corresponding non-human residues. Humanized antibodies 
may also comprise residues which are found neither in the recipient antibody nor in the 
imported CDR or framework sequences. In general, the humanized antibody will comprise 
substantially all of at least one, and typically two, variable domains, in which all or 
substantially all of the CDR regions correspond to those of a non-human immunoglobulin 
and all or substantially all of the FR regions are those of a human immunoglobulin consensus 
sequence. The humanized antibody optimally also will comprise at least a portion of an 
immunoglobulin constant region (Fc), typically that of a human immunoglobulin [Jones et 
al., Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-329 (1988); and Presta, 
Curr. Op. Struct. Biol., 2:593-596 (1992)]. 

[123] Methods for humanizing non-human antibodies are well known in the 
art. Generally, a humanized antibody has one or more amino acid residues introduced into it 
from a source which is non-human. These non-human amino acid residues are often referred 
to as import residues, which are typically taken from an import variable domain. 
Humanization can be essentially performed following the method of Winter and co-workers 
[Jones et al., Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); 
Verhoeyen et al., Science, 239:1534-1536 (1988)], by substituting rodent CDRs or CDR 
sequences for the corresponding sequences of a human antibody. Accordingly, such 
humanized antibodies are chimeric antibodies (U.S. Patent No. 4,816,567), wherein 
substantially less than an intact human variable domain has been substituted by the 
corresponding sequence from a non-human species. In practice, humanized antibodies are 
typically human antibodies in which some CDR residues and possibly some FR residues are 
substituted by residues from analogous sites in rodent antibodies. 

[124] Human antibodies can also be produced using various techniques 
known in the art, including phage display libraries [Hoogenboom and Winter, J. Mol. Biol., 
227:381 (1991); Marks et al., J. Mol. Biol., 222:581 (1991)]. The techniques of Cole et al. 
and Boerner et al. are also available for the preparation of human monoclonal antibodies 
[Cole et al., Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, p. 77 (1985) and 
Boerner et al., J. Immunol., 147(1):86-95 (1991)]. Similarly, human antibodies can be made 
by introducing human immunoglobulin loci into transgenic animals, e.g., mice in which 
the endogenous immunoglobulin genes have been partially or completely inactivated. Upon 
challenge, human antibody production is observed, which closely resembles that seen in 
humans in all respects, including gene rearrangement, assembly, and antibody repertoire. 
This approach is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 
5,569,825; 5,625,126; 5,633,425; 5,661,016, and in the following scientific publications: 
Marks et al., Bio/Technology 10, 779-783 (1992); Lonberg et al., Nature 368, 856-859 (1994); 
Morrison, Nature 368, 812-13 (1994); Fishwild et al., Nature Biotechnology 14, 845-51 
(1996); Neuberger, Nature Biotechnology 14, 826 (1996); Lonberg and Huszar, Intern. Rev. 
Immunol. 13, 65-93 (1995). 

[125] By immunotherapy is meant treatment of colorectal cancer with an 
antibody raised against colorectal cancer proteins. As used herein, immunotherapy can be 
passive or active. Passive immunotherapy as defined herein is the passive transfer of 
antibody to a recipient (patient). Active immunization is the induction of antibody and/or 
T-cell responses in a recipient (patient). Induction of an immune response is the result of 
providing the recipient with an antigen to which antibodies are raised. As appreciated by one 
of ordinary skill in the art, the antigen may be provided by injecting a polypeptide against 
which antibodies are desired to be raised into a recipient, or contacting the recipient with a 
nucleic acid capable of expressing the antigen and under conditions for expression of the 
antigen. 

[126] In a preferred embodiment the colorectal cancer proteins against 
which antibodies are raised are secreted proteins as described above. Without being bound 
by theory, antibodies used for treatment bind and prevent the secreted protein from binding 
to its receptor, thereby inactivating the secreted colorectal cancer protein. 

[127] In another preferred embodiment, the colorectal cancer protein to 
which antibodies are raised is a transmembrane protein. Without being bound by theory, 
antibodies used for treatment bind the extracellular domain of the colorectal cancer protein 
and prevent it from binding to other proteins, such as circulating ligands or cell-associated 
molecules. The antibody may cause down-regulation of the transmembrane colorectal cancer 
protein. As will be appreciated by one of ordinary skill in the art, the antibody may be a 
competitive, non-competitive or uncompetitive inhibitor of protein binding to the 
extracellular domain of the colorectal cancer protein. The antibody is also an antagonist of 
the colorectal cancer protein. Further, the antibody prevents activation of the transmembrane 
colorectal cancer protein. In one aspect, when the antibody prevents the binding of other 
molecules to the colorectal cancer protein, the antibody prevents growth of the cell. The 
antibody also sensitizes the cell to cytotoxic agents, including, but not limited to, TNF-α, 
TNF-β, IL-1, IFN-γ and IL-2, or chemotherapeutic agents including 5FU, vinblastine, 
actinomycin D, cisplatin, methotrexate, and the like. In some instances the antibody belongs 
to a sub-type that activates serum complement when complexed with the transmembrane 
protein, thereby mediating cytotoxicity. Thus, colorectal cancer is treated by administering to 
a patient antibodies directed against the transmembrane colorectal cancer protein. 

[128] In another preferred embodiment, the antibody is conjugated to a 
therapeutic moiety. In one aspect the therapeutic moiety is a small molecule that modulates 
the activity of the colorectal cancer protein. In another aspect the therapeutic moiety 
modulates the activity of molecules associated with or in close proximity to the colorectal 
cancer protein. The therapeutic moiety may inhibit enzymatic activity such as protease or 
protein kinase activity associated with colorectal cancer. 

[129] In a preferred embodiment, the therapeutic moiety may also be a 
cytotoxic agent. In this method, targeting the cytotoxic agent to tumor tissue or cells results 
in a reduction in the number of afflicted cells, thereby reducing symptoms associated with 
colorectal cancer. Cytotoxic agents are numerous and varied and include, but are not limited 
to, cytotoxic drugs or toxins or active fragments of such toxins. Suitable toxins and their 
corresponding fragments include diphtheria A chain, exotoxin A chain, ricin A chain, abrin A 
chain, curcin, crotin, phenomycin, enomycin and the like. Cytotoxic agents also include 
radiochemicals made by conjugating radioisotopes to antibodies raised against colorectal 
cancer proteins, or binding of a radionuclide to a chelating agent that has been covalently 
attached to the antibody. Targeting the therapeutic moiety to transmembrane colorectal 
cancer proteins not only serves to increase the local concentration of therapeutic moiety in 
the colorectal cancer afflicted area, but also serves to reduce deleterious side effects that may 
be associated with the therapeutic moiety. 

[130] In another preferred embodiment, the colorectal cancer protein against 
which the antibodies are raised is an intracellular protein. In this case, the antibody may be 
conjugated to a protein which facilitates entry into the cell. In one case, the antibody enters 
the cell by endocytosis. In another embodiment, a nucleic acid encoding the antibody is 
administered to the individual or cell. Moreover, wherein the colorectal cancer protein can 
be targeted within a cell, i.e., the nucleus, an antibody thereto contains a signal for that target 
localization, i.e., a nuclear localization signal. 

[131] The colorectal cancer antibodies of the invention specifically bind to 
colorectal cancer proteins. By "specifically bind" herein is meant that the antibodies bind to 
the protein with a binding constant in the range of at least 10*^- 10"^ M'^ with a preferred 
range being lO-^-lO'^NT^ 

[132] In a preferred embodiment, the colorectal cancer protein is purified or 
isolated after expression. Colorectal cancer proteins may be isolated or purified in a variety 
of ways known to those skilled in the art depending on what other components are present in 
the sample. Standard purification methods include electrophoretic, molecular, 
immunological and chromatographic techniques, including ion exchange, hydrophobic, 
affinity, and reverse-phase HPLC chromatography, and chromatofocusing. For example, the 
colorectal cancer protein may be purified using a standard anti-colorectal cancer antibody 
column. Ultrafiltration and diafiltration techniques, in conjunction with protein 
concentration, are also useful. For general guidance in suitable purification techniques, see 
Scopes, R., Protein Purification, Springer-Verlag, NY (1982). The degree of purification 
necessary will vary depending on the use of the colorectal cancer protein. In some instances 
no purification will be necessary. 

[133] Once expressed and purified if necessary, the colorectal cancer 
proteins and nucleic acids are useful in a number of applications. 

[134] In one aspect, the expression levels of genes are determined for 
different cellular states in the colorectal cancer phenotype; that is, the expression levels of 
genes in normal colon tissue and in colorectal cancer tissue (and in some cases, for varying 
severities of colorectal cancer that relate to prognosis, as outlined below) are evaluated to 
provide expression profiles. An expression profile of a particular cell state or point of 
development is essentially a "fingerprint" of the state; while two states may have any 
particular gene similarly expressed, the evaluation of a number of genes simultaneously 
allows the generation of a gene expression profile that is unique to the state of the cell. By 
comparing expression profiles of cells in different states, information regarding which genes 
are important (including both up- and down-regulation of genes) in each of these states is 
obtained. Then, diagnosis may be done or confirmed: does tissue from a particular patient 
have the gene expression profile of normal or colorectal cancer tissue? 
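
By way of illustration only, the comparison of a patient "fingerprint" against reference profiles can be sketched as follows. The gene values, the two reference profiles, and the use of Pearson correlation as the similarity measure are assumptions made for this sketch; the specification does not prescribe a particular comparison metric.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def closest_state(sample, references):
    """Return the name of the reference profile most correlated with the sample."""
    return max(references, key=lambda name: pearson(sample, references[name]))

# Hypothetical expression levels for five illustrative genes (arbitrary units).
references = {
    "normal": [10.0, 12.0, 3.0, 8.0, 5.0],
    "colorectal cancer": [30.0, 4.0, 15.0, 8.0, 1.0],
}
patient = [28.0, 5.0, 13.0, 9.0, 2.0]
print(closest_state(patient, references))
```

In this hypothetical case the patient profile tracks the colorectal cancer reference far more closely than the normal reference, so the sketch reports that state. A real assay would use many more genes and a validated classifier.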

[135] "Differential expression," or grammatical equivalents as used herein, 
refers to both qualitative as well as quantitative differences in the genes' temporal and/or 
cellular expression patterns within and among the cells. Thus, a differentially expressed gene 
can qualitatively have its expression altered, including an activation or inactivation, in, for 
example, normal versus colorectal cancer tissue. That is, genes may be turned on or turned 
off in a particular state, relative to another state. As is apparent to the skilled artisan, any 
comparison of two or more states can be made. Such a qualitatively regulated gene will 
exhibit an expression pattern within a state or cell type which is detectable by standard 
techniques in one such state or cell type, but is not detectable in both. Alternatively, the 
determination is quantitative in that expression is increased or decreased; that is, the 
expression of the gene is either upregulated, resulting in an increased amount of transcript, or 
downregulated, resulting in a decreased amount of transcript. The degree to which 
expression differs need only be large enough to quantify via standard characterization 
techniques as outlined below, such as by use of Affymetrix GeneChip™ expression arrays, 
Lockhart, Nature Biotechnology, 14:1675-1680 (1996), hereby expressly incorporated by 
reference. Other techniques include, but are not limited to, quantitative reverse transcriptase 
PCR, Northern analysis and RNase protection. As outlined above, preferably the change in 
expression (i.e. upregulation or downregulation) is at least about 50%, more preferably at 
least about 100%, more preferably at least about 150%, more preferably at least about 200%, 
with from 300 to at least 1000% being especially preferred. 
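
The percentage thresholds above can be made concrete with a short sketch. It assumes that the change is computed relative to the reference (e.g., normal tissue) transcript level, which the text does not state explicitly; the function names are hypothetical.

```python
def percent_change(reference, sample):
    """Percent change in transcript level of the sample relative to the reference."""
    if reference <= 0:
        raise ValueError("reference level must be positive")
    return (sample - reference) / reference * 100.0

def meets_threshold(reference, sample, threshold_pct=50.0):
    """True if the magnitude of the change is at least the given percentage."""
    return abs(percent_change(reference, sample)) >= threshold_pct

# Hypothetical transcript levels (arbitrary units): normal tissue 100, tumor 250.
print(percent_change(100.0, 250.0))      # upregulation relative to normal
print(meets_threshold(100.0, 250.0, 150.0))
```

Under this convention a decrease can never exceed 100%, so the larger thresholds are meaningful for downregulation only if the ratio is taken in the other direction (reference over sample); that choice is left open here.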

[136] As will be appreciated by those in the art, this may be done by 
evaluation at either the gene transcript or the protein level; that is, the amount of gene 
expression may be monitored using nucleic acid probes to the DNA or RNA equivalent of the 
gene transcript, and the quantification of gene expression levels, or, alternatively, the final 
gene product itself (protein) can be monitored, for example through the use of antibodies to 
the colorectal cancer protein and standard immunoassays (ELISAs, etc.) or other techniques, 
including mass spectroscopy assays, 2D gel electrophoresis assays, etc. Thus, the proteins 
corresponding to colorectal cancer genes, i.e. those identified as being important in a 
colorectal cancer phenotype, can be evaluated in a colorectal cancer diagnostic test. 

[137] In a preferred embodiment, gene expression monitoring is done and a 
number of genes, i.e. an expression profile, is monitored simultaneously, although multiple 
protein expression monitoring can be done as well. Similarly, these assays may be done on 
an individual basis as well. 

[138] In this embodiment, the colorectal cancer nucleic acid probes are 
attached to biochips as outlined herein for the detection and quantification of colorectal 
cancer sequences in a particular cell. The assays are further described below in the example. 

[139] In a preferred embodiment nucleic acids encoding the colorectal 
cancer protein are detected. Although DNA or RNA encoding the colorectal cancer protein 
may be detected, of particular interest are methods wherein the mRNA encoding a colorectal 
cancer protein is detected. The presence of mRNA in a sample is an indication that the 
colorectal cancer gene has been transcribed to form the mRNA, and suggests that the protein 
is expressed. Probes to detect the mRNA can be any nucleotide/deoxynucleotide probe that 
is complementary to and base pairs with the mRNA and includes but is not limited to 
oligonucleotides, cDNA or RNA. Probes also should contain a detectable label, as defined 
herein. In one method the mRNA is detected after immobilizing the nucleic acid to be 
examined on a solid support such as nylon membranes and hybridizing the probe with the 
sample. Following washing to remove the non-specifically bound probe, the label is 
detected. In another method detection of the mRNA is performed in situ. In this method 
permeabilized cells or tissue samples are contacted with a detectably labeled nucleic acid 
probe for sufficient time to allow the probe to hybridize with the target mRNA. Following 
washing to remove the non-specifically bound probe, the label is detected. For example a 
digoxygenin labeled riboprobe (RNA probe) that is complementary to the mRNA encoding a 
colorectal cancer protein is detected by binding the digoxygenin with an anti-digoxygenin 
secondary antibody and developed with nitro blue tetrazolium and 5-bromo-4-chloro-3- 
indolyl phosphate. 
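
As a minimal sketch of what "complementary to and base pairs with the mRNA" means for a riboprobe, the antisense RNA sequence for a target segment can be derived as shown below. The sequence and function names are hypothetical and are given only to illustrate Watson-Crick pairing; they are not taken from the sequences of the invention.

```python
# Watson-Crick pairing rules for RNA.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def antisense_riboprobe(mrna):
    """Return the RNA probe (5'->3') complementary to an mRNA segment (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(mrna.upper()))

def is_complementary(probe, target):
    """True if the probe is the exact antisense of the target segment."""
    return antisense_riboprobe(target) == probe.upper()

target_segment = "AUGGCC"  # hypothetical mRNA fragment
probe = antisense_riboprobe(target_segment)
print(probe, is_complementary(probe, target_segment))
```

Because the two strands pair in antiparallel orientation, the probe is the reversed complement of the target rather than a base-by-base complement read in the same direction.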

[140] In a preferred embodiment, any of the three classes of proteins as 
described herein (secreted, transmembrane or intracellular proteins) are used in diagnostic 
assays. The colorectal cancer proteins, antibodies, nucleic acids, modified proteins and cells 
containing colorectal cancer sequences are used in diagnostic assays. This can be done on an 
individual gene or corresponding polypeptide level. In a preferred embodiment, the 
expression profiles are used, preferably in conjunction with high throughput screening 
techniques to allow monitoring for expression profile genes and/or corresponding 
polypeptides. 

[141] As described and defined herein, colorectal cancer proteins, including 
intracellular, transmembrane or secreted proteins, find use as markers of colorectal cancer. 
Detection of these proteins in putative colorectal cancer tissue or patients allows for a 
determination or diagnosis of colorectal cancer. Numerous methods known to those of 
ordinary skill in the art find use in detecting colorectal cancer. In one embodiment, 
antibodies are used to detect colorectal cancer proteins. A preferred method separates 
proteins from a sample or patient by electrophoresis on a gel (typically a denaturing and 
reducing protein gel, but may be any other type of gel including isoelectric focusing gels and 
the like). Following separation of proteins, the colorectal cancer protein is detected by 
immunoblotting with antibodies raised against the colorectal cancer protein. Methods of 
immunoblotting are well known to those of ordinary skill in the art. 

[142] In another preferred method, antibodies to the colorectal cancer 
protein find use in in situ imaging techniques. In this method cells are contacted with from 
one to many antibodies to the colorectal cancer protein(s). Following washing to remove 
non-specific antibody binding, the presence of the antibody or antibodies is detected. In one 
embodiment the antibody is detected by incubating with a secondary antibody that contains a 
detectable label. In another method the primary antibody to the colorectal cancer protein(s) 
contains a detectable label. In another preferred embodiment each one of multiple primary 
antibodies contains a distinct and detectable label. This method finds particular use in 
simultaneous screening for a plurality of colorectal cancer proteins. As will be appreciated 
by one of ordinary skill in the art, numerous other histological imaging techniques are useful 
in the invention. 

[143] In a preferred embodiment the label is detected in a fluorometer which 
has the ability to detect and distinguish emissions of different wavelengths. In addition, a 
fluorescence activated cell sorter (FACS) can be used in the method. 

[144] In another preferred embodiment, antibodies find use in diagnosing 
colorectal cancer from blood samples. As previously described, certain colorectal cancer 
proteins are secreted/circulating molecules. Blood samples, therefore, are useful as samples 
to be probed or tested for the presence of secreted colorectal cancer proteins. Antibodies can 
be used to detect the colorectal cancer by any of the previously described immunoassay 
techniques including ELISA, immunoblotting (Western blotting), immunoprecipitation, 
BIACORE technology and the like, as will be appreciated by one of ordinary skill in the art. 

[145] In a preferred embodiment, in situ hybridization of labeled colorectal 
cancer nucleic acid probes to tissue arrays is done. For example, arrays of tissue samples, 
including colorectal cancer tissue and/or normal tissue, are made. In situ hybridization as is 
known in the art can then be done. 

[146] It is understood that when comparing the fingerprints between an 
individual and a standard, the skilled artisan can make a diagnosis as well as a prognosis. It 
is further understood that the genes which indicate the diagnosis may differ from those which 
indicate the prognosis. 

[147] In a preferred embodiment, the colorectal cancer proteins, antibodies, 
nucleic acids, modified proteins and cells containing colorectal cancer sequences are used in 
prognosis assays. As above, gene expression profiles can be generated that correlate to 
colorectal cancer severity, in terms of long term prognosis. Again, this may be done on 
either a protein or gene level, with the use of genes being preferred. As above, the colorectal 
cancer probes are attached to biochips for the detection and quantification of colorectal 
cancer sequences in a tissue or patient. The assays proceed as outlined for diagnosis. 

[148] In a preferred embodiment, any of the three classes of proteins as 
described herein are used in drug screening assays. The colorectal cancer proteins, 
antibodies, nucleic acids, modified proteins and cells containing colorectal cancer sequences 
are used in drug screening assays or by evaluating the effect of drug candidates on a "gene 
expression profile" or expression profile of polypeptides. In a preferred embodiment, the 
expression profiles are used, preferably in conjunction with high throughput screening 
techniques to allow monitoring for expression profile genes after treatment with a candidate 
agent, Zlokarnik, et al., Science 279, 84-8 (1998), Heid, 1996 #69. 

[149] In a preferred embodiment, the colorectal cancer proteins, antibodies, 
nucleic acids, modified proteins and cells containing the native or modified colorectal cancer 
proteins are used in screening assays. That is, the present invention provides novel methods 
for screening for compositions which modulate the colorectal cancer phenotype. As above, 
this can be done on an individual gene level or by evaluating the effect of drug candidates on 
a "gene expression profile". In a preferred embodiment, the expression profiles are used, 
preferably in conjunction with high throughput screening techniques to allow monitoring for 
expression profile genes after treatment with a candidate agent, see Zlokarnik, supra. 
Having identified the differentially expressed genes herein, a variety of assays may be 
executed. In a preferred embodiment, assays may be run on an individual gene or protein 
level. That is, having identified a particular gene as up regulated in colorectal cancer, 
candidate bioactive agents may be screened to modulate this gene's response; preferably to 
down regulate the gene, although in some circumstances to up regulate the gene. 

[150] The phrase "functional effects" in the context of assays for testing 
compounds that modulate activity of a colorectal cancer protein or colorectal cancer nucleic 
acid includes the determination of a parameter that is indirectly or directly under the 
influence of a colorectal cancer protein or nucleic acid, e.g., a physical (direct), or phenotypic 
or chemical effect (indirect), such as the ability to increase or decrease cellular proliferation. 
It includes cell cycle arrest, the ability of cells to proliferate, and other characteristics of 
proliferating cells. "Functional effects" include in vitro, in vivo, and ex vivo activities. 

[151] By "determining the functional effect" is meant assaying for a 
compound that increases or decreases a parameter that is indirectly or directly under the 
influence of a colorectal cancer protein or nucleic acid, e.g., physical, phenotypic and 
chemical effects. Such functional effects can be measured by any means known to those 
skilled in the art, including physical effects such as changes in spectroscopic characteristics (e.g., 
fluorescence, absorbance, refractive index); hydrodynamic (e.g., shape); chromatographic; or 
solubility properties for the protein; measuring ligand binding activity or binding assays, e.g. 
binding to antibodies; measuring changes in ligand binding activity; and chemical or 
phenotypic effects such as measuring inducible markers or transcriptional activation of the 
protein; measuring cellular proliferation; measuring cell surface marker expression; 
measurement of changes in protein levels for colorectal cancer-associated sequences; 
measurement of RNA stability; phosphorylation or dephosphorylation; signal transduction, 
e.g., receptor-ligand interactions, second messenger concentrations (e.g., cAMP, IP3, or 
intracellular Ca2+); identification of downstream or reporter gene expression (CAT, 
luciferase, β-gal, GFP and the like), e.g., via chemiluminescence, fluorescence, colorimetric 
reactions, antibody binding, and inducible markers. 

[152] "Inhibitors", "activators", and "modulators" of colorectal cancer 
polynucleotide and polypeptide sequences are used to refer to activating, inhibitory, or 
modulating molecules identified using in vitro and in vivo assays of colorectal cancer 
polynucleotide and polypeptide sequences. Inhibitors are compounds that, e.g., bind to, 
partially or totally block activity, decrease, prevent, delay activation, inactivate, desensitize, 
or down regulate the activity or expression of colorectal cancer proteins or nucleic acids, e.g., 
antagonists. "Activators" are compounds that increase, open, activate, facilitate, enhance 
activation, sensitize, agonize, or up regulate colorectal cancer protein or nucleic acid activity. 
Inhibitors, activators, or modulators also include genetically modified versions of colorectal 
cancer proteins, e.g., versions with altered activity, as well as naturally occurring and 
synthetic ligands, antagonists, agonists, antibodies, antisense molecules, peptides, ribozymes, 
small chemical molecules and the like. Such assays for inhibitors and activators include, e.g., 
expressing colorectal cancer protein in vitro, in cells, or cell membranes, applying putative 
modulator compounds, and then determining the functional effects on activity, as described 
above. 

[153] Samples or assays comprising colorectal cancer proteins or colorectal 
cancer nucleic acids that are treated with a potential activator, inhibitor, or modulator are 
compared to control samples without the inhibitor, activator, or modulator to examine the 
extent of inhibition. Control samples (untreated with inhibitors) are assigned a relative 
activity value of 100%. Inhibition of colorectal cancer is achieved when the activity value 
relative to the control is about 80%, preferably 50%, more preferably 25-0%. Activation of 
colorectal cancer is achieved when the activity value relative to the control (untreated with 
activators) is 110%, more preferably 150%, more preferably 200-500% (i.e., two to five fold 
higher relative to the control), more preferably 1000-3000% higher. 
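
The relative activity scale described above can be illustrated with a short sketch. It assumes that "about 80%" is read as "80% of control or less" for inhibition and that 110% is read as "110% of control or more" for activation; the function names and readings are hypothetical.

```python
def relative_activity(treated, control):
    """Activity of a treated sample as a percentage of the untreated control (100%)."""
    if control <= 0:
        raise ValueError("control activity must be positive")
    return treated / control * 100.0

def classify(treated, control):
    """Label a test compound using the outermost thresholds (assumed inclusive)."""
    value = relative_activity(treated, control)
    if value <= 80.0:
        return "inhibition"
    if value >= 110.0:
        return "activation"
    return "no clear effect"

# Hypothetical assay readings in arbitrary units, control = 100.
for reading in (40.0, 95.0, 150.0):
    print(reading, classify(reading, 100.0))
```

Tighter cut-offs (for example 50% or 25% for inhibition) can be substituted for the outermost thresholds where a more stringent screen is wanted.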

[154] As will be appreciated by those in the art, this may be done by 
evaluation at either the gene or the protein level; that is, the amount of gene expression may 
be monitored using nucleic acid probes and the quantification of gene expression levels, or, 
alternatively, the gene product itself can be monitored, for example through the use of 
antibodies to the colorectal cancer protein and standard immunoassays. 

[155] In a preferred embodiment, gene expression monitoring is done and a 
number of genes, i.e. an expression profile, is monitored simultaneously, although multiple 
protein expression monitoring can be done as well. 

[156] In this embodiment, the colorectal cancer nucleic acid probes are 
attached to biochips as outlined herein for the detection and quantification of colorectal 
cancer sequences in a particular cell. The assays are further described below. 

[157] Generally, in a preferred embodiment, a candidate bioactive agent is 
added to the cells prior to analysis. Moreover, screens are provided to identify a candidate 
bioactive agent which modulates colorectal cancer, modulates colorectal cancer proteins, 
binds to a colorectal cancer protein, or interferes with the binding between a colorectal cancer 
protein and an antibody. 

[158] The term "candidate bioactive agent" or "test compound" or "drug 
candidate" or "modulator" or grammatical equivalents as used herein describes any molecule, 
either naturally occurring or synthetic, e.g., protein, oligopeptide (e.g., from about 5 to about 
25 amino acids in length, preferably from about 10 to 20 or 12 to 18 amino acids in length, 
preferably 12, 15, or 18 amino acids in length), small organic molecule, polysaccharide, lipid, 
fatty acid, polynucleotide, oligonucleotide, etc., to be tested for the capacity to directly or 
indirectly modulate colorectal cancer sequences, including both nucleic acid and protein 
sequences. The test compound can be in the form of a library of test compounds, such as a 
combinatorial or randomized library that provides a sufficient range of diversity. Test 
compounds are optionally linked to a fusion partner, e.g., targeting compounds, rescue 
compounds, dimerization compounds, stabilizing compounds, addressable compounds, and 
other functional moieties. Conventionally, new chemical entities with useful properties are 
generated by identifying a test compound (called a "lead compound") with some desirable 
property or activity, e.g., inhibiting activity, creating variants of the lead compound, and 
evaluating the property and activity of those variant compounds. Often, high throughput 
screening (HTS) methods are employed for such an analysis. 

[159] In preferred embodiments, the bioactive agents modulate the 
expression profiles, or expression profile nucleic acids or proteins provided herein. In a 
particularly preferred embodiment, the candidate agent suppresses a colorectal cancer 
phenotype, for example to a normal colon tissue fingerprint. Similarly, the candidate agent 
preferably suppresses a severe colorectal cancer phenotype. Generally a plurality of assay 
mixtures are run in parallel with different agent concentrations to obtain a differential 
response to the various concentrations. Typically, one of these concentrations serves as a 
negative control, i.e., at zero concentration or below the level of detection. 
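
One way to read out such parallel assay mixtures is to express each response as a percentage of the zero-concentration negative control. The following sketch assumes a simple mapping of agent concentration to measured signal; the data and names are hypothetical.

```python
def normalize_to_control(readings):
    """Express each response as a percentage of the zero-concentration control.

    readings maps agent concentration to measured signal; the entry at
    concentration 0.0 is taken as the negative control.
    """
    control = readings[0.0]
    return {
        conc: signal / control * 100.0
        for conc, signal in sorted(readings.items())
        if conc != 0.0
    }

# Hypothetical signals (arbitrary units) at increasing agent concentrations.
readings = {0.0: 200.0, 1.0: 150.0, 10.0: 50.0}
print(normalize_to_control(readings))
```

A response that falls steadily as concentration rises, as in this hypothetical series, is the kind of differential response the parallel mixtures are intended to reveal.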

[160] In one aspect, a candidate agent will neutralize the effect of a 
colorectal cancer protein. By "neutralize" is meant that activity of a protein is either 
inhibited or counteracted so as to have substantially no effect on a cell. 

[161] Candidate agents encompass numerous chemical classes, though
typically they are organic molecules, preferably small organic compounds having a molecular
weight of more than 100 and less than about 2,500 daltons. Preferred small molecules are
less than 2000, or less than 1500 or less than 1000 or less than 500 D. Candidate agents
comprise functional groups necessary for structural interaction with proteins, particularly
hydrogen bonding, and typically include at least an amine, carbonyl, hydroxyl or carboxyl
group, preferably at least two of the functional chemical groups. The candidate agents often
comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic
structures substituted with one or more of the above functional groups. Candidate agents are
also found among biomolecules including peptides, saccharides, fatty acids, steroids, purines,
pyrimidines, derivatives, structural analogs or combinations thereof. Particularly preferred
are peptides.
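
As an illustration only, the molecular weight window and functional group preference stated above could be applied as a simple filter. The `Candidate` record, its field names, and the example compounds are hypothetical; only the 100 to 2,500 dalton range and the four functional groups come from the text.

```python
from dataclasses import dataclass

# Functional groups named in the text as typical of candidate agents.
FUNCTIONAL_GROUPS = frozenset({"amine", "carbonyl", "hydroxyl", "carboxyl"})


@dataclass(frozen=True)
class Candidate:
    name: str            # hypothetical identifier
    mol_weight: float    # daltons
    groups: frozenset    # functional groups present in the molecule


def is_preferred_small_molecule(c, min_mw=100.0, max_mw=2500.0, min_groups=2):
    """True if the weight is inside the window and enough groups are present."""
    n_groups = len(c.groups & FUNCTIONAL_GROUPS)
    return min_mw < c.mol_weight < max_mw and n_groups >= min_groups


library = [
    Candidate("cmpd-1", 350.0, frozenset({"amine", "hydroxyl"})),
    Candidate("cmpd-2", 80.0, frozenset({"amine", "carboxyl"})),   # too light
    Candidate("cmpd-3", 900.0, frozenset({"carbonyl"})),           # one group only
]
preferred = [c.name for c in library if is_preferred_small_molecule(c)]
```

Tightening `max_mw` to 2000, 1500, 1000 or 500 reproduces the narrower preferred ranges.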

[162] Candidate agents are obtained from a wide variety of sources including
libraries of synthetic or natural compounds. For example, numerous means are available for
random and directed synthesis of a wide variety of organic compounds and biomolecules,
including expression of randomized oligonucleotides. Alternatively, libraries of natural
compounds in the form of bacterial, fungal, plant and animal extracts are available or readily
produced. Additionally, natural or synthetically produced libraries and compounds are
readily modified through conventional chemical, physical and biochemical means. Known
pharmacological agents may be subjected to directed or random chemical modifications, such
as acylation, alkylation, esterification, or amidification, to produce structural analogs.

[163] In a preferred embodiment, the candidate bioactive agents are
proteins. By "protein" herein is meant at least two covalently attached amino acids, which
includes proteins, polypeptides, oligopeptides and peptides. The protein may be made up of
naturally occurring amino acids and peptide bonds, or synthetic peptidomimetic structures.
Thus "amino acid", or "peptide residue", as used herein means both naturally occurring and
synthetic amino acids. For example, homo-phenylalanine, citrulline and norleucine are
considered amino acids for the purposes of the invention. "Amino acid" also includes imino
acid residues such as proline and hydroxyproline. The side chains may be in either the (R)
or the (S) configuration. In the preferred embodiment, the amino acids are in the (S) or L-
configuration. If non-naturally occurring side chains are used, non-amino acid substituents
may be used, for example to prevent or retard in vivo degradations.

[164] In a preferred embodiment, the candidate bioactive agents are naturally
occurring proteins or fragments of naturally occurring proteins. Thus, for example, cellular
extracts containing proteins, or random or directed digests of proteinaceous cellular extracts,
may be used. In this way libraries of procaryotic and eucaryotic proteins may be made for
screening in the methods of the invention. Particularly preferred in this embodiment are
libraries of bacterial, fungal, viral, and mammalian proteins, with the latter being preferred,
and human proteins being especially preferred.

[165] In a preferred embodiment, the candidate bioactive agents are peptides
of from about 5 to about 30 amino acids, with from about 5 to about 20 amino acids being
preferred, and from about 7 to about 15 being particularly preferred. The peptides may be
digests of naturally occurring proteins as is outlined above, random peptides, or "biased"
random peptides. By "randomized" or grammatical equivalents herein is meant that each
nucleic acid and peptide consists of essentially random nucleotides and amino acids,
respectively. Since generally these random peptides (or nucleic acids, discussed below) are
chemically synthesized, they may incorporate any nucleotide or amino acid at any position.
The synthetic process can be designed to generate randomized proteins or nucleic acids, to
allow the formation of all or most of the possible combinations over the length of the
sequence, thus forming a library of randomized candidate bioactive proteinaceous agents.

[166] In one embodiment, the library is fully randomized, with no sequence
preferences or constants at any position. In a preferred embodiment, the library is biased.
That is, some positions within the sequence are either held constant, or are selected from a
limited number of possibilities. For example, in a preferred embodiment, the nucleotides or
amino acid residues are randomized within a defined class, for example, of hydrophobic
amino acids, hydrophilic residues, sterically biased (either small or large) residues, towards
the creation of nucleic acid binding domains, the creation of cysteines, for cross-linking,
prolines for SH-3 domains, serines, threonines, tyrosines or histidines for phosphorylation
sites, etc., or to purines, etc.
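
The distinction between a fully randomized and a biased peptide library can be sketched in silico as follows. This is a minimal illustration, not a synthesis protocol: the function names, the seed, and the particular set of residues treated as hydrophobic are assumptions made for the example.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard one-letter codes
HYDROPHOBIC = "AVLIMFW"                # illustrative grouping, an assumption


def random_peptide(length, rng, position_choices=None):
    """Draw one peptide; positions listed in position_choices are restricted."""
    position_choices = position_choices or {}
    return "".join(
        rng.choice(position_choices.get(i, AMINO_ACIDS)) for i in range(length)
    )


def make_library(size, length, seed=0, position_choices=None):
    """Build a list of peptides from a seeded generator for reproducibility."""
    rng = random.Random(seed)
    return [random_peptide(length, rng, position_choices) for _ in range(size)]


# Fully randomized: any residue at any position.
fully_random = make_library(size=5, length=10)

# Biased: cysteines held constant at both ends (e.g. for cross-linking) and
# position 4 restricted to the hydrophobic class.
biased = make_library(
    size=5, length=10, position_choices={0: "C", 9: "C", 4: HYDROPHOBIC}
)
```

Holding a position constant is expressed as a one-letter choice string, while restricting it to a class is expressed as a longer choice string.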

[167] In a preferred embodiment, the candidate bioactive agents are nucleic
acids, as defined above.

[168] As described above generally for proteins, nucleic acid candidate
bioactive agents may be naturally occurring nucleic acids, random nucleic acids, or "biased"
random nucleic acids. For example, digests of procaryotic or eucaryotic genomes may be
used as is outlined above for proteins.

[169] In a preferred embodiment, the candidate bioactive agents are organic 
chemical moieties, a wide variety of which are available in the literature. 

[170] "Antibody" refers to a polypeptide comprising a framework region
from an immunoglobulin gene or fragments thereof that specifically binds and recognizes an
antigen. The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma,
delta, epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable
region genes. Light chains are classified as either kappa or lambda. Heavy chains are
classified as gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin
classes, IgG, IgM, IgA, IgD and IgE, respectively. Typically, the antigen-binding region of
an antibody will be most critical in specificity and affinity of binding.

[171] An exemplary immunoglobulin (antibody) structural unit comprises a
tetramer. Each tetramer is composed of two identical pairs of polypeptide chains, each pair
having one "light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus
of each chain defines a variable region of about 100 to 110 or more amino acids primarily
responsible for antigen recognition. The terms variable light chain (VL) and variable heavy
chain (VH) refer to these light and heavy chains respectively.

[172] Antibodies exist, e.g., as intact immunoglobulins or as a number of
well-characterized fragments produced by digestion with various peptidases. Thus, for
example, pepsin digests an antibody below the disulfide linkages in the hinge region to
produce F(ab)'2, a dimer of Fab which itself is a light chain joined to VH-CH1 by a disulfide
bond. The F(ab)'2 may be reduced under mild conditions to break the disulfide linkage in the
hinge region, thereby converting the F(ab)'2 dimer into an Fab' monomer. The Fab'
monomer is essentially Fab with part of the hinge region (see Fundamental Immunology
(Paul ed., 3d ed. 1993)). While various antibody fragments are defined in terms of the
digestion of an intact antibody, one of skill will appreciate that such fragments may be
synthesized de novo either chemically or by using recombinant DNA methodology. Thus,
the term antibody, as used herein, also includes antibody fragments either produced by the
modification of whole antibodies, or those synthesized de novo using recombinant DNA
methodologies (e.g., single chain Fv) or those identified using phage display libraries (see,
e.g., McCafferty et al., Nature 348:552-554 (1990)).

[173] For preparation of antibodies, e.g., recombinant, monoclonal, or
polyclonal antibodies, many techniques known in the art can be used (see, e.g., Kohler &
Milstein, Nature 256:495-497 (1975); Kozbor et al., Immunology Today 4:72 (1983); Cole et
al., pp. 77-96 in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc. (1985);
Coligan, Current Protocols in Immunology (1991); Harlow & Lane, Antibodies, A
Laboratory Manual (1988); and Goding, Monoclonal Antibodies: Principles and Practice (2d
ed. 1986)). The genes encoding the heavy and light chains of an antibody of interest can be
cloned from a cell, e.g., the genes encoding a monoclonal antibody can be cloned from a
hybridoma and used to produce a recombinant monoclonal antibody. Gene libraries encoding
heavy and light chains of monoclonal antibodies can also be made from hybridoma or
plasma cells. Random combinations of the heavy and light chain gene products generate a
large pool of antibodies with different antigenic specificity (see, e.g., Kuby, Immunology (3rd
ed. 1997)). Techniques for the production of single chain antibodies or recombinant
antibodies (U.S. Patent 4,946,778, U.S. Patent No. 4,816,567) can be adapted to produce
antibodies to polypeptides of this invention. Also, transgenic mice, or other organisms such
as other mammals, may be used to express humanized or human antibodies (see, e.g., U.S.
Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425; 5,661,016, Marks et al.,
Bio/Technology 10:779-783 (1992); Lonberg et al., Nature 368:856-859 (1994); Morrison,
Nature 368:812-13 (1994); Fishwild et al., Nature Biotechnology 14:845-51 (1996);
Neuberger, Nature Biotechnology 14:826 (1996); and Lonberg & Huszar, Intern. Rev.
Immunol. 13:65-93 (1995)). Alternatively, phage display technology can be used to identify
antibodies and heteromeric Fab fragments that specifically bind to selected antigens (see, e.g.,
McCafferty et al., Nature 348:552-554 (1990); Marks et al., Biotechnology 10:779-783
(1992)). Antibodies can also be made bispecific, i.e., able to recognize two different antigens
(see, e.g., WO 93/08829, Traunecker et al., EMBO J. 10:3655-3659 (1991); and Suresh et al.,
Methods in Enzymology 121:210 (1986)). Antibodies can also be heteroconjugates, e.g., two
covalently joined antibodies, or immunotoxins (see, e.g., U.S. Patent No. 4,676,980; WO
91/00360; WO 92/200373; and EP 03089).

[174] Methods for humanizing or primatizing non-human antibodies are well
known in the art. Generally, a humanized antibody has one or more amino acid residues
introduced into it from a source which is non-human. These non-human amino acid residues
are often referred to as import residues, which are typically taken from an import variable
domain. Humanization can be essentially performed following the method of Winter and
co-workers (see, e.g., Jones et al., Nature 321:522-525 (1986); Riechmann et al., Nature
332:323-327 (1988); Verhoeyen et al., Science 239:1534-1536 (1988) and Presta, Curr. Op.
Struct. Biol. 2:593-596 (1992)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. Accordingly, such humanized antibodies are
chimeric antibodies (U.S. Patent No. 4,816,567), wherein substantially less than an intact
human variable domain has been substituted by the corresponding sequence from a
non-human species. In practice, humanized antibodies are typically human antibodies in which
some CDR residues and possibly some FR residues are substituted by residues from
analogous sites in rodent antibodies.

[175] A "chimeric antibody" is an antibody molecule in which (a) the
constant region, or a portion thereof, is altered, replaced or exchanged so that the antigen
binding site (variable region) is linked to a constant region of a different or altered class,
effector function and/or species, or an entirely different molecule which confers new
properties to the chimeric antibody, e.g., an enzyme, toxin, hormone, growth factor, drug,
etc.; or (b) the variable region, or a portion thereof, is altered, replaced or exchanged with a
variable region having a different or altered antigen specificity.

[176] In one embodiment, the antibody is conjugated to an "effector" moiety.
The effector moiety can be any number of molecules, including labeling moieties such as
radioactive labels or fluorescent labels, or can be a therapeutic moiety. In one aspect the
antibody modulates the activity of the protein.

[177] After the candidate agent has been added and the cells allowed to
incubate for some period of time, the sample containing the target sequences to be analyzed is
added to the biochip. If required, the target sequence is prepared using known techniques.
For example, the sample may be treated to lyse the cells, using known lysis buffers,
electroporation, etc., with purification and/or amplification such as PCR occurring as needed,
as will be appreciated by those in the art. For example, an in vitro transcription with labels
covalently attached to the nucleosides is done. Generally, the nucleic acids are labeled with
biotin-FITC or PE, or with cy3 or cy5.

[178] In a preferred embodiment, the target sequence is labeled with, for
example, a fluorescent, a chemiluminescent, a chemical, or a radioactive signal, to provide a
means of detecting the target sequence's specific binding to a probe. The label also can be an
enzyme, such as alkaline phosphatase or horseradish peroxidase, which when provided with
an appropriate substrate produces a product that can be detected. Alternatively, the label can
be a labeled compound or small molecule, such as an enzyme inhibitor, that binds but is not
catalyzed or altered by the enzyme. The label also can be a moiety or compound, such as an
epitope tag or biotin which specifically binds to streptavidin. For the example of biotin, the
streptavidin is labeled as described above, thereby providing a detectable signal for the
bound target sequence. As known in the art, unbound labeled streptavidin is removed prior to
analysis.

[179] As will be appreciated by those in the art, these assays can be direct
hybridization assays or can comprise "sandwich assays", which include the use of multiple
probes, as is generally outlined in U.S. Patent Nos. 5,681,702, 5,597,909, 5,545,730,
5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,571,670, 5,591,584, 5,624,802, 5,635,352,
5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of which are hereby incorporated by
reference. In this embodiment, in general, the target nucleic acid is prepared as outlined
above, and then added to the biochip comprising a plurality of nucleic acid probes, under
conditions that allow the formation of a hybridization complex.

[180] A variety of hybridization conditions may be used in the present
invention, including high, moderate and low stringency conditions as outlined above. The
assays are generally run under stringency conditions which allow formation of the label
probe hybridization complex only in the presence of target. Stringency can be controlled by
altering a step parameter that is a thermodynamic variable, including, but not limited to,
temperature, formamide concentration, salt concentration, chaotropic salt concentration, pH,
organic solvent concentration, etc.
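
To make the dependence of stringency on these variables concrete, the sketch below uses one commonly cited empirical approximation for the melting temperature of long DNA-DNA hybrids. The formula is not part of this specification and is offered only as an assumption for illustration; the function and parameter names are likewise hypothetical.

```python
import math


def estimate_tm_celsius(na_molar, percent_gc, percent_formamide, length_bp):
    """Approximate melting temperature (deg C) of a DNA-DNA hybrid.

    Empirical approximation assumed here:
    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * percent_gc
            - 0.61 * percent_formamide
            - 500.0 / length_bp)


# Lower salt or added formamide lowers the estimated Tm, so a fixed wash
# temperature becomes more stringent under those conditions.
tm_high_salt = estimate_tm_celsius(1.0, 50.0, 0.0, 500)
tm_low_salt = estimate_tm_celsius(0.1, 50.0, 0.0, 500)
tm_formamide = estimate_tm_celsius(1.0, 50.0, 50.0, 500)
```

Under this approximation, running a step closer to the estimated melting temperature corresponds to higher stringency.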

[181] These parameters may also be used to control non-specific binding, as
is generally outlined in U.S. Patent No. 5,681,697. Thus it may be desirable to perform
certain steps at higher stringency conditions to reduce non-specific binding.

[182] The reactions outlined herein may be accomplished in a variety of
ways, as will be appreciated by those in the art. Components of the reaction may be added
simultaneously, or sequentially, in any order, with preferred embodiments outlined below. In
addition, a variety of other reagents may be included in the assays.
These include reagents like salts, buffers, neutral proteins, e.g. albumin, detergents, etc. which
may be used to facilitate optimal hybridization and detection, and/or reduce non-specific or
background interactions. Also reagents that otherwise improve the efficiency of the assay,
such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used,
depending on the sample preparation methods and purity of the target.

[183] Once the assay is run, the data is analyzed to determine the expression
levels, and changes in expression levels as between states, of individual genes, forming a
gene expression profile.
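
One simple way to express changes in expression level between two states is a per-gene log2 ratio. The sketch below assumes each state is summarized as a mapping from gene identifier to a non-negative signal; the pseudocount, the function name, and the example values are assumptions for illustration, not a method prescribed by the text.

```python
import math


def log2_fold_changes(state_a, state_b, pseudocount=1.0):
    """Per-gene log2(state_b / state_a) for genes measured in both states.

    The pseudocount avoids division by zero for genes with no signal.
    """
    shared = state_a.keys() & state_b.keys()
    return {
        gene: math.log2((state_b[gene] + pseudocount)
                        / (state_a[gene] + pseudocount))
        for gene in sorted(shared)
    }


# Hypothetical signals for two genes in normal and tumor tissue.
normal = {"geneA": 7.0, "geneB": 3.0}
tumor = {"geneA": 31.0, "geneB": 0.0}
profile = log2_fold_changes(normal, tumor)
```

Positive values mark genes higher in the second state and negative values mark genes lower in it.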

[184] The screens are done to identify drugs or bioactive agents that
modulate the colorectal cancer phenotype. Specifically, there are several types of screens
that can be run. A preferred embodiment is in the screening of candidate agents that can
induce or suppress a particular expression profile, thus preferably generating the associated
phenotype. That is, candidate agents that can mimic or produce an expression profile in
colorectal cancer similar to the expression profile of normal colon tissue are expected to result
in a suppression of the colorectal cancer phenotype. Thus, in this embodiment, mimicking an
expression profile, or changing one profile to another, is the goal.
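
Whether an agent shifts a cancer profile toward the normal profile could be scored in many ways. The sketch below assumes profiles are equal-length lists of expression values over the same ordered genes and uses Pearson correlation as the similarity measure; both the measure and the function names are assumptions chosen for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def moves_toward_normal(normal, cancer, treated):
    """True if the treated profile resembles normal more than untreated cancer does."""
    return pearson(treated, normal) > pearson(cancer, normal)


# Hypothetical four-gene profiles.
normal_profile = [1.0, 2.0, 3.0, 4.0]
cancer_profile = [4.0, 3.0, 2.0, 1.0]
treated_profile = [1.0, 2.0, 3.0, 5.0]
shifted = moves_toward_normal(normal_profile, cancer_profile, treated_profile)
```

An agent for which `shifted` is true would be a candidate for mimicking the normal profile under this scoring assumption.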

[185] In a preferred embodiment, as for the diagnosis and prognosis
applications, having identified the differentially expressed genes important in any one state,
screens can be run to alter the expression of the genes individually. That is, screening for
modulation of regulation of expression of a single gene can be done; that is, rather than try to
mimic all or part of an expression profile, screening for regulation of individual genes can be
done. Thus, for example, particularly in the case of target genes whose presence or absence
is unique between two states, screening is done for modulators of the target gene expression.

[186] In a preferred embodiment, screening is done to alter the biological
function of the expression product of the differentially expressed gene. Again, having
identified the importance of a gene in a particular state, screening for agents that bind and/or
modulate the biological activity of the gene product can be run as is more fully outlined
below.

[187] Thus, screening of candidate agents that modulate the colorectal cancer
phenotype either at the gene expression level or the protein level can be done.

[188] In addition screens can be done for novel genes that are induced in
response to a candidate agent. After identifying a candidate agent based upon its ability to
suppress a colorectal cancer expression pattern leading to a normal expression pattern, or
modulate a single colorectal cancer gene expression profile so as to mimic the expression of
the gene from normal tissue, a screen as described above can be performed to identify genes
that are specifically modulated in response to the agent. Comparing expression profiles
between normal tissue and agent treated colorectal cancer tissue reveals genes that are not
expressed in normal tissue or colorectal cancer tissue, but are expressed in agent treated
tissue. These agent specific sequences can be identified and used by any of the methods
described herein for colorectal cancer genes or proteins. In particular these sequences and
the proteins they encode find use in marking or identifying agent treated cells. In addition,
antibodies can be raised against the agent induced proteins and used to target novel
therapeutics to the treated colorectal cancer tissue sample.
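
The comparison described above amounts to a set difference over expressed genes. In the sketch below, a gene is called "expressed" when its signal meets a threshold; the threshold value, the function name, and the example gene identifiers are hypothetical.

```python
def agent_specific_genes(normal, cancer, treated, threshold=1.0):
    """Genes expressed in agent-treated tissue but in neither untreated state.

    Each argument maps gene identifier -> expression signal.
    """
    def expressed(profile):
        return {gene for gene, level in profile.items() if level >= threshold}

    return expressed(treated) - expressed(normal) - expressed(cancer)


# Hypothetical profiles: geneC appears only after treatment.
normal_tissue = {"geneA": 5.0, "geneB": 0.0, "geneC": 0.0}
cancer_tissue = {"geneA": 0.2, "geneB": 8.0, "geneC": 0.1}
treated_tissue = {"geneA": 4.0, "geneB": 1.5, "geneC": 6.0}
induced = agent_specific_genes(normal_tissue, cancer_tissue, treated_tissue)
```

Genes returned here would correspond to the agent specific sequences discussed in the paragraph above.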

[189] Thus, in one embodiment, a candidate agent is administered to a
population of colorectal cancer cells, which thus has an associated colorectal cancer
expression profile. By "administration" or "contacting" herein is meant that the candidate
agent is added to the cells in such a manner as to allow the agent to act upon the cell, whether
by uptake and intracellular action, or by action at the cell surface. In some embodiments,
nucleic acid encoding a proteinaceous candidate agent (i.e. a peptide) may be put into a viral
construct such as a retroviral construct and added to the cell, such that expression of the
peptide agent is accomplished; see PCT US97/01019, hereby expressly incorporated by
reference.

[190] Once the candidate agent has been administered to the cells, the cells
can be washed if desired and are allowed to incubate under preferably physiological
conditions for some period of time. The cells are then harvested and a new gene expression
profile is generated, as outlined herein.

[191] Thus, for example, colorectal cancer tissue may be screened for
agents that reduce or suppress the colorectal cancer phenotype. A change in at least one
gene of the expression profile indicates that the agent has an effect on colorectal cancer
activity. By defining such a signature for the colorectal cancer phenotype, screens for new
drugs that alter the phenotype can be devised. With this approach, the drug target need not be
known and need not be represented in the original expression screening platform, nor does
the level of transcript for the target protein need to change.

[192] In a preferred embodiment, as outlined above, screens may be done on
individual genes and gene products (proteins). That is, having identified a particular
differentially expressed gene as important in a particular state, screening of modulators of
either the expression of the gene or the gene product itself can be done. The gene products of
differentially expressed genes are sometimes referred to herein as "colorectal cancer
modulator proteins". The colorectal cancer modulator protein may be a fragment, or
alternatively, be the full length protein to a fragment shown herein. Preferably, the colorectal
cancer modulator protein is a fragment of approximately 14 to 24 amino acids long. More
preferably the fragment is a soluble fragment.

[193] In a preferred embodiment, the fragment is charged and from the
c-terminus. In one embodiment, the c-terminus of the fragment is kept as a free acid and the
n-terminus is a free amine to aid in coupling, i.e., to cysteine. In another embodiment, the
fragment is an internal peptide overlapping a hydrophilic stretch of the protein. In a preferred
embodiment, the termini are blocked. In another preferred embodiment, the fragment is a
novel fragment from the N-terminal. In one embodiment, the fragment excludes sequence
outside of the N-terminal; in another embodiment, the fragment includes at least a portion of
the N-terminal. "N-terminal" is used interchangeably herein with "N-terminus" which is
further described above.

[194] In one embodiment the colorectal cancer proteins are conjugated to an
immunogenic agent as discussed herein. In one embodiment the colorectal cancer protein is
conjugated to BSA.

[195] Thus, in a preferred embodiment, screening for modulators of
expression of specific genes can be done. This will be done as outlined above, but in general
the expression of only one or a few genes is evaluated.

[196] In a preferred embodiment, screens are designed to first find candidate
agents that can bind to differentially expressed proteins, and then these agents may be used in
assays that evaluate the ability of the candidate agent to modulate differentially expressed
activity. Thus, as will be appreciated by those in the art, there are a number of different
assays which may be run: binding assays and activity assays.

[197] In a preferred embodiment, binding assays are done. In general,
purified or isolated gene product is used; that is, the gene products of one or more
differentially expressed nucleic acids are made. In general, this is done as is known in the art.
For example, antibodies are generated to the protein gene products, and standard
immunoassays are run to determine the amount of protein present. Alternatively, cells
comprising the colorectal cancer proteins can be used in the assays.

[198] Thus, in a preferred embodiment, the methods comprise combining a
colorectal cancer protein and a candidate bioactive agent, and determining the binding of the
candidate agent to the colorectal cancer protein. Preferred embodiments utilize the human
colorectal cancer protein, although other mammalian proteins may also be used, for example
for the development of animal models of human disease. In some embodiments, as outlined
herein, variant or derivative colorectal cancer proteins may be used.

[199] Generally, in a preferred embodiment of the methods herein, the
colorectal cancer protein or the candidate agent is non-diffusably bound to an insoluble
support having isolated sample receiving areas (e.g. a microtiter plate, an array, etc.). The
insoluble supports may be made of any composition to which the compositions can be bound,
that is readily separated from soluble material, and that is otherwise compatible with the overall
method of screening. The surface of such supports may be solid or porous and of any
convenient shape. Examples of suitable insoluble supports include microtiter plates, arrays,
membranes and beads. These are typically made of glass, plastic (e.g., polystyrene),
polysaccharides, nylon or nitrocellulose, teflon, etc. Microtiter plates and arrays are
especially convenient because a large number of assays can be carried out simultaneously,
using small amounts of reagents and samples. The particular manner of binding of the
composition is not crucial so long as it is compatible with the reagents and overall methods of
the invention, maintains the activity of the composition and is nondiffusable. Preferred
methods of binding include the use of antibodies (which do not sterically block either the
ligand binding site or activation sequence when the protein is bound to the support), direct
binding to "sticky" or ionic supports, chemical crosslinking, the synthesis of the protein or
agent on the surface, etc. Following binding of the protein or agent, excess unbound material
is removed by washing. The sample receiving areas may then be blocked through incubation
with bovine serum albumin (BSA), casein or other innocuous protein or other moiety.

[200] In a preferred embodiment, the colorectal cancer protein is bound to
the support, and a candidate bioactive agent is added to the assay. Alternatively, the
candidate agent is bound to the support and the colorectal cancer protein is added. Novel
binding agents include specific antibodies, non-natural binding agents identified in screens of
chemical libraries, peptide analogs, etc. Of particular interest are screening assays for agents
that have a low toxicity for human cells. A wide variety of assays may be used for this
purpose, including labeled in vitro protein-protein binding assays, electrophoretic mobility
shift assays, immunoassays for protein binding, functional assays (phosphorylation assays,
etc.) and the like.

[201] The determination of the binding of the candidate bioactive agent to
the colorectal cancer protein may be done in a number of ways. In a preferred embodiment,
the candidate bioactive agent is labeled, and binding determined directly. For example, this
may be done by attaching all or a portion of the colorectal cancer protein to a solid support,
adding a labeled candidate agent (for example a fluorescent label), washing off excess
reagent, and determining whether the label is present on the solid support. Various blocking
and washing steps may be utilized as is known in the art.

[202] By "labeled" herein is meant that the compound is either directly or
indirectly labeled with a label which provides a detectable signal, e.g. radioisotope,
fluorescers, enzyme, antibodies, particles such as magnetic particles, chemiluminescers, or
specific binding molecules, etc. Specific binding molecules include pairs, such as biotin and
streptavidin, digoxin and antidigoxin etc. For the specific binding members, the
complementary member would normally be labeled with a molecule which provides for
detection, in accordance with known procedures, as outlined above. The label can directly or
indirectly provide a detectable signal.

[203] In some embodiments, only one of the components is labeled. For
example, the proteins (or proteinaceous candidate agents) may be labeled at tyrosine
positions using ¹²⁵I, or with fluorophores. Alternatively, more than one component may be
labeled with different labels; using ¹²⁵I for the proteins, for example, and a fluorophore for the
candidate agents.

[204] In a preferred embodiment, the binding of the candidate bioactive
agent is determined through the use of competitive binding assays. In this embodiment, the
competitor is a binding moiety known to bind to the target molecule (i.e. colorectal cancer protein),
such as an antibody, peptide, binding partner, ligand, etc. Under certain circumstances, there
may be competitive binding as between the bioactive agent and the binding moiety, with the
binding moiety displacing the bioactive agent.

[205] In one embodiment, the candidate bioactive agent is labeled. Either
the candidate bioactive agent, or the competitor, or both, is added first to the protein for a
time sufficient to allow binding, if present. Incubations may be performed at any
temperature which facilitates optimal activity, typically between 4 and 40°C. Incubation
periods are selected for optimum activity, but may also be optimized to facilitate rapid high
throughput screening. Typically between 0.1 and 1 hour will be sufficient. Excess reagent is
generally removed or washed away. The second component is then added, and the presence
or absence of the labeled component is followed, to indicate binding.

[206] In a preferred embodiment, the competitor is added first, followed by
the candidate bioactive agent. Displacement of the competitor is an indication that the
candidate bioactive agent is binding to the colorectal cancer protein and thus is capable of
binding to, and potentially modulating, the activity of the colorectal cancer protein. In this
embodiment, either component can be labeled. Thus, for example, if the competitor is
labeled, the presence of label in the wash solution indicates displacement by the agent.
Alternatively, if the candidate bioactive agent is labeled, the presence of the label on the
support indicates displacement.

[207] In an alternative embodiment, the candidate bioactive agent is added
first, with incubation and washing, followed by the competitor. The absence of binding by
the competitor may indicate that the bioactive agent is bound to the colorectal cancer protein
with a higher affinity. Thus, if the candidate bioactive agent is labeled, the presence of the
label on the support, coupled with a lack of competitor binding, may indicate that the
candidate agent is capable of binding to the colorectal cancer protein.

[208] In a preferred embodiment, the methods comprise differential
screening to identify bioactive agents that are capable of modulating the activity of the
colorectal cancer proteins. In this embodiment, the methods comprise combining a
colorectal cancer protein and a competitor in a first sample. A second sample comprises a
candidate bioactive agent, a colorectal cancer protein and a competitor. The binding of the
competitor is determined for both samples, and a change, or difference in binding between
the two samples indicates the presence of an agent capable of binding to the colorectal
cancer protein and potentially modulating its activity. That is, if the binding of the
competitor is different in the second sample relative to the first sample, the agent is capable
of binding to the colorectal cancer protein.

[209] Alternatively, a preferred embodiment utilizes differential screening to
identify drug candidates that bind to the native colorectal cancer protein, but cannot bind to
modified colorectal cancer proteins. The structure of the colorectal cancer protein may be
modeled, and used in rational drug design to synthesize agents that interact with that site.
Drug candidates that affect colorectal cancer bioactivity are also identified by screening
drugs for the ability to either enhance or reduce the activity of the protein.

[210] Positive controls and negative controls may be used in the assays.
Preferably all control and test samples are run in at least triplicate to obtain
statistically significant results. Incubation of all samples is for a time sufficient for the
binding of the agent to the protein. Following incubation, all samples are washed free of non-
specifically bound material and the amount of bound, generally labeled agent determined.
For example, where a radiolabel is employed, the samples may be counted in a scintillation
counter to determine the amount of bound compound.

[211] A variety of other reagents may be included in the screening assays.
These include reagents like salts, neutral proteins, e.g. albumin, detergents, etc., which may be
used to facilitate optimal protein-protein binding and/or reduce non-specific or background
interactions. Also reagents that otherwise improve the efficiency of the assay, such as
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture
of components may be added in any order that provides for the requisite binding.
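
By way of illustration only, the comparison of competitor binding between the two samples of the differential screen described above can be sketched in Python. The function name, the background parameter and the counts shown are hypothetical and are not taken from this specification; the sketch merely assumes triplicate scintillation counts of a labeled competitor for each sample.

```python
from statistics import mean


def percent_displacement(first_sample_counts, second_sample_counts, background=0.0):
    """Percent drop in bound labeled competitor when the candidate agent is present.

    first_sample_counts:  protein + labeled competitor (no candidate agent)
    second_sample_counts: protein + labeled competitor + candidate agent
    """
    if len(first_sample_counts) < 3 or len(second_sample_counts) < 3:
        raise ValueError("at least triplicate measurements are expected")
    first = mean(first_sample_counts) - background
    second = mean(second_sample_counts) - background
    if first <= 0:
        raise ValueError("first-sample signal must exceed background")
    return 100.0 * (first - second) / first


# Hypothetical counts per minute, for illustration only.
first_sample = [1000.0, 1100.0, 900.0]
second_sample = [480.0, 520.0, 500.0]
print(percent_displacement(first_sample, second_sample))
```

A value near zero would suggest no difference in competitor binding between the two samples; what size of difference counts as meaningful is left to the controls and statistics of the particular assay.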

[212] Screening for agents that modulate the activity of colorectal cancer
proteins may also be done. In a preferred embodiment, methods for screening for a bioactive
agent capable of modulating the activity of colorectal cancer proteins comprise the steps of
adding a candidate bioactive agent to a sample of colorectal cancer proteins, as above, and
determining an alteration in the biological activity of colorectal cancer proteins.
"Modulating the activity of colorectal cancer" includes an increase in activity, a decrease in
activity, or a change in the type or kind of activity present. Thus, in this embodiment, the
candidate agent should both bind to colorectal cancer proteins (although this may not be
necessary), and alter their biological or biochemical activity as defined herein. The methods
include both in vitro screening methods, as are generally outlined above, and in vivo
screening of cells for alterations in the presence, distribution, activity or amount of colorectal
cancer proteins.

[213] Thus, in this embodiment, the methods comprise combining a
colorectal cancer sample and a candidate bioactive agent, and evaluating the effect on
colorectal cancer activity. By "colorectal cancer activity" or grammatical equivalents herein
is meant one of the colorectal cancer's biological activities, including, but not limited to, cell
division, preferably in colon tissue, cell proliferation, tumor growth, and transformation of cells.
In one embodiment, colorectal cancer activity includes activation of a gene identified by a
nucleic acid of Table 1. An inhibitor of colorectal cancer activity inhibits any one
or more colorectal cancer activities.

[214] In a preferred embodiment, the activity of the colorectal cancer protein
is increased; in another preferred embodiment, the activity of the colorectal cancer protein is
decreased. Thus, bioactive agents that are antagonists are preferred in some embodiments,
and bioactive agents that are agonists may be preferred in other embodiments.

[215] In a preferred embodiment, the invention provides methods for
screening for bioactive agents capable of modulating the activity of a colorectal cancer
protein. The methods comprise adding a candidate bioactive agent, as defined above, to a
cell comprising colorectal cancer proteins. Preferred cell types include almost any cell. The
cells contain a recombinant nucleic acid that encodes a colorectal cancer protein. In a
preferred embodiment, a library of candidate agents is tested on a plurality of cells.

[216] In one aspect, the assays are evaluated in the presence or absence of, or
with previous or subsequent exposure to, physiological signals, for example hormones, antibodies,
peptides, antigens, cytokines, growth factors, action potentials, pharmacological agents
including chemotherapeutics, radiation, carcinogenics, or other cells (i.e. cell-cell contacts).
In another example, the determinations are made at different stages of the cell cycle
process.

[217] In this way, bioactive agents are identified. Compounds with
pharmacological activity are able to enhance or interfere with the activity of the colorectal
cancer protein. In one embodiment, "colorectal cancer protein activity" as used herein
includes at least one of the following: colorectal cancer activity, binding to the colorectal
cancer protein, activation of the colorectal cancer protein or activation of substrates of the
colorectal cancer protein by the colorectal cancer protein. In one embodiment, colorectal
cancer activity is defined as the unregulated proliferation of colon tissue, or the growth of
cancer in colon tissue. In one aspect, colorectal cancer activity as defined herein is related to
the activity of the colorectal cancer protein in the upregulation of the colorectal cancer
protein in colon cancer tissue.

[218] In another embodiment, colorectal cancer protein activity includes at
least one of the following: colorectal cancer activity, binding to the CBF9 nucleic acid or
polypeptide of Table 2 or binding to a nucleic acid of Table 1, or a peptide encoded by a
nucleic acid of Table 1 or activation of substrates of the gene products identified by a nucleic
acid of Table 1 or substrates of CBF9, which is shown in Table 2. In one aspect, colorectal
cancer activity as defined herein is related to the activity of genes defined by the nucleic acids
of Table 1 or of CBF9 as defined in Table 2, in colon cancer tissue.

[219] In one embodiment, a method of inhibiting colon cancer cell division is 
provided. The method comprises administration of a colorectal cancer inhibitor. 

[220] In another embodiment, a method of inhibiting tumor growth is
provided. The method comprises administration of a colorectal cancer inhibitor.

[221] In a further embodiment, methods of treating cells or individuals with
cancer are provided. The method comprises administration of a colorectal cancer inhibitor.

[222] In one embodiment, a colorectal cancer inhibitor is an antibody as
discussed above. In another embodiment, the colorectal cancer inhibitor is an antisense
molecule. Antisense molecules as used herein include antisense or sense oligonucleotides
comprising a single-stranded nucleic acid sequence (either RNA or DNA) capable of binding
to target mRNA (sense) or DNA (antisense) sequences for colorectal cancer molecules. A
preferred antisense molecule is for the colorectal cancer sequences referenced in Table 1 or
Table 2, or for a ligand or activator thereof. Antisense or sense oligonucleotides, according
to the present invention, comprise a fragment generally at least about 14 nucleotides,
preferably from about 14 to 30 nucleotides. The ability to derive an antisense or a sense
oligonucleotide, based upon a cDNA sequence encoding a given protein is described in, for
example, Stein and Cohen (Cancer Res. 48:2659, 1988) and van der Krol et al.
(BioTechniques 6:958, 1988).
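
As a minimal sketch, assuming the sense strand of a cDNA is available as a string of A, C, G and T, an antisense oligonucleotide of the length range given above can be derived as the reverse complement of a window of the target. The sequence and the function name are hypothetical and serve only to illustrate the operation; they are not sequences of Table 1 or Table 2.

```python
# Complement table for the DNA alphabet.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(cdna_sense, start, length=20):
    """Return the DNA antisense oligonucleotide (written 5'->3') for a window
    of the sense-strand cDNA beginning at 0-based index `start`."""
    if not 14 <= length <= 30:
        raise ValueError("length is expected to be about 14 to 30 nucleotides")
    target = cdna_sense.upper()[start:start + length]
    if len(target) < length:
        raise ValueError("window runs past the end of the sequence")
    if set(target) - set("ACGT"):
        raise ValueError("sequence contains characters other than A, C, G, T")
    # Complement each base, then reverse so the result reads 5'->3'.
    return target.translate(_COMPLEMENT)[::-1]


# Hypothetical 24-nucleotide sense sequence, for illustration only.
example_cdna = "ATGGCCAAGCTTGGATCCGAATTC"
print(antisense_oligo(example_cdna, 0, 14))
```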

[223] Antisense molecules may be introduced into a cell containing the target
nucleotide sequence by formation of a conjugate with a ligand binding molecule, as described
in WO 91/04753. Suitable ligand binding molecules include, but are not limited to, cell
surface receptors, growth factors, other cytokines, or other ligands that bind to cell surface
receptors. Preferably, conjugation of the ligand binding molecule does not substantially
interfere with the ability of the ligand binding molecule to bind to its corresponding molecule
or receptor, or block entry of the sense or antisense oligonucleotide or its conjugated version
into the cell. Alternatively, a sense or an antisense oligonucleotide may be introduced into a
cell containing the target nucleic acid sequence by formation of an oligonucleotide-lipid
complex, as described in WO 90/10448. It is understood that antisense molecules
or knock out and knock in models may also be used in screening assays as discussed above,
in addition to methods of treatment.

[224] The compounds having the desired pharmacological activity may be
administered in a physiologically acceptable carrier to a host, as previously described. The
agents may be administered in a variety of ways, orally, parenterally e.g., subcutaneously,
intraperitoneally, intravascularly, etc. Depending upon the manner of introduction, the
compounds may be formulated in a variety of ways. The concentration of therapeutically
active compound in the formulation may vary from about 0.1-100 wt.%. The agents may be
administered alone or in combination with other treatments, i.e., radiation.

[225] The pharmaceutical compositions can be prepared in various forms,
such as granules, tablets, pills, suppositories, capsules, suspensions, salves, lotions and the
like. Pharmaceutical grade organic or inorganic carriers and/or diluents suitable for oral and
topical use can be used to make up compositions containing the therapeutically-active
compounds. Diluents known to the art include aqueous media, vegetable and animal oils and
fats. Stabilizing agents, wetting and emulsifying agents, salts for varying the osmotic
pressure or buffers for securing an adequate pH value, and skin penetration enhancers can be
used as auxiliary agents.

[226] Without being bound by theory, it appears that the various colorectal
cancer sequences are important in colorectal cancer. Accordingly, disorders based on
mutant or variant colorectal cancer genes may be determined. In one embodiment, the
invention provides methods for identifying cells containing variant colorectal cancer genes
comprising determining all or part of the sequence of at least one endogenous colorectal
cancer gene in a cell. As will be appreciated by those in the art, this may be done using any
number of sequencing techniques. In a preferred embodiment, the invention provides
methods of identifying the colorectal cancer genotype of an individual comprising
determining all or part of the sequence of at least one colorectal cancer gene of the
individual. This is generally done in at least one tissue of the individual, and may include the
evaluation of a number of tissues or different samples of the same tissue. The method may
include comparing the sequence of the sequenced colorectal cancer gene to a known
colorectal cancer gene, i.e. a wild-type gene.

[227] The sequence of all or part of the colorectal cancer gene can then be
compared to the sequence of a known colorectal cancer gene to determine if any differences
exist. This can be done using any number of known homology programs, such as Bestfit, etc.
In a preferred embodiment, the presence of a difference in the sequence between the
colorectal cancer gene of the patient and the known colorectal cancer gene is indicative of a
disease state or a propensity for a disease state, as outlined herein.
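
Purely as an illustration of the comparison step, and not as a substitute for a homology program such as Bestfit, the following Python sketch lists the positions at which a sample sequence differs from a known sequence. It assumes the two sequences have already been aligned to equal length; the sequences and the function name are hypothetical.

```python
def sequence_differences(known, sample):
    """List (1-based position, known base, sample base) for each mismatch
    between two pre-aligned sequences of equal length."""
    if len(known) != len(sample):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        (position, k, s)
        for position, (k, s) in enumerate(zip(known.upper(), sample.upper()), start=1)
        if k != s
    ]


# Hypothetical sequences, for illustration only.
known_gene = "ATGGCCAAGC"
patient_gene = "ATGGTCAAGC"
print(sequence_differences(known_gene, patient_gene))
```

An empty list indicates that no difference was found over the compared region.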

[228] In a preferred embodiment, the colorectal cancer genes are used as 
probes to determine the number of copies of the colorectal cancer gene in the genome. 

[229] In another preferred embodiment colorectal cancer genes are used as
probes to determine the chromosomal localization of the colorectal cancer genes.
Information such as chromosomal localization finds use in providing a diagnosis or prognosis
in particular when chromosomal abnormalities such as translocations, and the like are
identified in colorectal cancer gene loci.

[230] Thus, in one embodiment, methods of modulating colorectal cancer in
cells or organisms are provided. In one embodiment, the methods comprise administering to
a cell an anti-colorectal cancer antibody that reduces or eliminates the biological activity of
an endogenous colorectal cancer protein. Alternatively, the methods comprise
administering to a cell or organism a recombinant nucleic acid encoding a colorectal cancer
protein. As will be appreciated by those in the art, this may be accomplished in any number
of ways. In a preferred embodiment, for example when the colorectal cancer sequence is
down-regulated in colorectal cancer, the activity of the colorectal cancer gene is increased
by increasing the amount of colorectal cancer in the cell, for example by overexpressing the
endogenous colorectal cancer or by administering a gene encoding the colorectal cancer
sequence, using known gene-therapy techniques, for example. In a preferred embodiment,
the gene therapy techniques include the incorporation of the exogenous gene using enhanced
homologous recombination (EHR), for example as described in PCT/US93/03868, hereby
incorporated by reference in its entirety. Alternatively, for example when the colorectal
cancer sequence is up-regulated in colorectal cancer, the activity of the endogenous
colorectal cancer gene is decreased, for example by the administration of a colorectal cancer
antisense nucleic acid.

[231] In one embodiment, the colorectal cancer proteins of the present
invention may be used to generate polyclonal and monoclonal antibodies to colorectal cancer
proteins, which are useful as described herein. Similarly, the colorectal cancer proteins can
be coupled, using standard technology, to affinity chromatography columns. These columns
may then be used to purify colorectal cancer antibodies. In a preferred embodiment, the
antibodies are generated to epitopes unique to a colorectal cancer protein; that is, the
antibodies show little or no cross-reactivity to other proteins. These antibodies find use in a
number of applications. For example, the colorectal cancer antibodies may be coupled to
standard affinity chromatography columns and used to purify colorectal cancer proteins. The
antibodies may also be used as blocking polypeptides, as outlined above, since they will
specifically bind to the colorectal cancer protein.

[232] In one embodiment, a therapeutically effective dose of a colorectal
cancer or modulator thereof is administered to a patient. By "therapeutically effective dose"
herein is meant a dose that produces the effects for which it is administered. The exact dose
will depend on the purpose of the treatment, and will be ascertainable by one skilled in the art
using known techniques. As is known in the art, adjustments for colorectal cancer
degradation, systemic versus localized delivery, and rate of new protease synthesis, as well as
the age, body weight, general health, sex, diet, time of administration, drug interaction and
the severity of the condition may be necessary, and will be ascertainable with routine
experimentation by those skilled in the art.

[233] A "patient" for the purposes of the present invention includes both
humans and other animals, particularly mammals, and organisms. Thus the methods are
applicable to both human therapy and veterinary applications. In the preferred embodiment
the patient is a mammal, and in the most preferred embodiment the patient is human.

[234] The administration of the colorectal cancer proteins and modulators
of the present invention can be done in a variety of ways as discussed above, including, but
not limited to, orally, subcutaneously, intravenously, intranasally, transdermally,
intraperitoneally, intramuscularly, intrapulmonary, vaginally, rectally, or intraocularly. In
some instances, for example, in the treatment of wounds and inflammation, the colorectal
cancer proteins and modulators may be directly applied as a solution or spray.

[235] The pharmaceutical compositions of the present invention comprise a
colorectal cancer protein in a form suitable for administration to a patient. In the preferred
embodiment, the pharmaceutical compositions are in a water soluble form, such as being
present as pharmaceutically acceptable salts, which is meant to include both acid and base
addition salts. "Pharmaceutically acceptable acid addition salt" refers to those salts that retain
the biological effectiveness of the free bases and that are not biologically or otherwise
undesirable, formed with inorganic acids such as hydrochloric acid, hydrobromic acid,
sulfuric acid, nitric acid, phosphoric acid and the like, and organic acids such as acetic acid,
propionic acid, glycolic acid, pyruvic acid, oxalic acid, maleic acid, malonic acid, succinic
acid, fumaric acid, tartaric acid, citric acid, benzoic acid, cinnamic acid, mandelic acid,
methanesulfonic acid, ethanesulfonic acid, p-toluenesulfonic acid, salicylic acid and the like.
"Pharmaceutically acceptable base addition salts" include those derived from inorganic bases
such as sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper,
manganese, aluminum salts and the like. Particularly preferred are the ammonium,
potassium, sodium, calcium, and magnesium salts. Salts derived from pharmaceutically
acceptable organic non-toxic bases include salts of primary, secondary, and tertiary amines,
substituted amines including naturally occurring substituted amines, cyclic amines and basic
ion exchange resins, such as isopropylamine, trimethylamine, diethylamine, triethylamine,
tripropylamine, and ethanolamine.

[236] The pharmaceutical compositions may also include one or more of the
following: carrier proteins such as serum albumin; buffers; fillers such as microcrystalline
cellulose, lactose, corn and other starches; binding agents; sweeteners and other flavoring
agents; coloring agents; and polyethylene glycol. Additives are well known in the art, and
are used in a variety of formulations.

[237] In a preferred embodiment, colorectal cancer proteins and modulators
are administered as therapeutic agents, and can be formulated as outlined above. Similarly,
colorectal cancer genes (including both the full-length sequence, partial sequences, or
regulatory sequences of the colorectal cancer coding regions) can be administered in gene
therapy applications, as is known in the art. These colorectal cancer genes can include
antisense applications, either as gene therapy (i.e. for incorporation into the genome) or as
antisense compositions, as will be appreciated by those in the art.

[238] In a preferred embodiment, colorectal cancer genes are administered 
as DNA vaccines, either single genes or combinations of colorectal cancer genes. Naked 
DNA vaccines are generally known in the art. Brower, Nature Biotechnology, 16:1304-1305 
(1998). 

[239] In one embodiment, colorectal cancer genes of the present invention
are used as DNA vaccines. Methods for the use of genes as DNA vaccines are well known to
one of ordinary skill in the art, and include placing a colorectal cancer gene or portion of a
colorectal cancer gene under the control of a promoter for expression in a colorectal cancer
patient. The colorectal cancer gene used for DNA vaccines can encode full-length colorectal
cancer proteins, but more preferably encodes portions of the colorectal cancer proteins
including peptides derived from the colorectal cancer protein. In a preferred embodiment a
patient is immunized with a DNA vaccine comprising a plurality of nucleotide sequences
derived from a colorectal cancer gene. Similarly, it is possible to immunize a patient with a
plurality of colorectal cancer genes or portions thereof as defined herein. Without being
bound by theory, upon expression of the polypeptide encoded by the DNA vaccine, cytotoxic
T-cells, helper T-cells and antibodies are induced which recognize and destroy or eliminate
cells expressing colorectal cancer proteins.

[240] In a preferred embodiment, the DNA vaccines include a gene encoding
an adjuvant molecule with the DNA vaccine. Such adjuvant molecules include cytokines that
increase the immunogenic response to the colorectal cancer polypeptide encoded by the
DNA vaccine. Additional or alternative adjuvants are known to those of ordinary skill in the
art and find use in the invention.

[241] In another preferred embodiment colorectal cancer genes find use in
generating animal models of colorectal cancer. As is appreciated by one of ordinary skill in
the art, when the colorectal cancer gene identified is repressed or diminished in colorectal
cancer tissue, gene therapy technology, wherein antisense RNA is directed to the colorectal
cancer gene, will also diminish or repress expression of the gene. An animal generated as
such serves as an animal model of colorectal cancer that finds use in screening bioactive
drug candidates. Similarly, gene knockout technology, for example as a result of
homologous recombination with an appropriate gene targeting vector, will result in the
absence of the colorectal cancer protein. When desired, tissue-specific expression or
knockout of the colorectal cancer protein may be necessary.

[242] It is also possible that the colorectal cancer protein is overexpressed in
colorectal cancer. As such, transgenic animals can be generated that overexpress the
colorectal cancer protein. Depending on the desired expression level, promoters of various
strengths can be employed to express the transgene. Also, the number of copies of the
integrated transgene can be determined and compared for a determination of the expression
level of the transgene. Animals generated by such methods find use as animal models of
colorectal cancer and are additionally useful in screening for bioactive molecules to treat
colorectal cancer.

EXAMPLES 

[243] It is understood that the examples described herein in no way serve to
limit the true scope of this invention, but rather are presented for illustrative purposes. All
references and sequences of accession numbers cited herein are incorporated by reference in
their entirety.

[244] Example 1

Tissue Preparation, Labeling Chips, and Fingerprints

[245] Purify total RNA from tissue using TRIzol Reagent
[246] Estimate tissue weight. Homogenize tissue samples in 1ml of TRIzol
per 50mg of tissue using a Polytron 3100 homogenizer. The generator/probe used depends
upon the tissue size. A generator that is too large for the amount of tissue to be homogenized
will cause a loss of sample and lower RNA yield. Use the 20mm generator for tissue
weighing more than 0.6g. If the working volume is greater than 2ml, then homogenize tissue
in a 15ml polypropylene tube (Falcon 2059). Fill tube no greater than 10ml.



HOMOGENIZATION

[247] Before using generator, it should have been cleaned after last usage by
running it through soapy H2O and rinsing thoroughly. Run through with EtOH to sterilize.
Keep tissue frozen until ready. Add TRIzol directly to frozen tissue then homogenize.

[248] Following homogenization, remove insoluble material from the
homogenate by centrifugation at 7500 x g for 15 min. in a Sorvall superspeed or 12,000 x g
for 10 min. in an Eppendorf centrifuge at 4°C. Transfer the cleared homogenate to a new
tube(s). The samples may be frozen now at -60 to -70°C (and kept for at least one month) or
you may continue with the purification.

PHASE SEPARATION
[249] Incubate the homogenized samples for 5 minutes at room temperature.
[250] Add 0.2ml of chloroform per 1ml of TRIzol reagent used in the
original homogenization.

[251] Cap tubes securely and shake tubes vigorously by hand (do not vortex)
for 15 seconds.

[252] Incubate samples at room temp. for 2-3 minutes. Centrifuge samples
at 6500rpm in a Sorvall superspeed for 30 min. at 4°C. (You may spin at up to 12,000 x g
for 10 min. but you risk breaking your tubes in the centrifuge.)

RNA PRECIPITATION
[253] Transfer the aqueous phase to a fresh tube. Save the organic phase if
isolation of DNA or protein is desired. Add 0.5ml of isopropyl alcohol per 1ml of TRIzol
reagent used in the original homogenization. Cap tubes securely and invert to mix. Incubate
samples at room temp. for 10 minutes. Centrifuge samples at 6500rpm in Sorvall for 20 min.
at 4°C.

RNA WASH

[254] Pour off the supernate. Wash pellet with cold 75% ethanol. Use 1ml
of 75% ethanol per 1ml of TRIzol reagent used in the initial homogenization. Cap tubes
securely and invert several times to loosen pellet. (Do not vortex). Centrifuge at <8000rpm
(<7500 x g) for 5 minutes at 4°C.
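
The reagent ratios stated in the steps above (TRIzol per tissue weight, then chloroform, isopropyl alcohol and 75% ethanol per ml of TRIzol) can be tabulated for a given sample. The following Python sketch is offered for convenience only; the function and key names are hypothetical and the example tissue weight is not taken from this protocol.

```python
def trizol_reagent_volumes(tissue_mg):
    """Reagent volumes (ml) scaled to tissue weight, using the ratios in the text."""
    if tissue_mg <= 0:
        raise ValueError("tissue weight must be positive")
    trizol_ml = tissue_mg / 50.0  # 1 ml TRIzol per 50 mg tissue
    return {
        "trizol_ml": trizol_ml,
        "chloroform_ml": 0.2 * trizol_ml,    # phase separation
        "isopropanol_ml": 0.5 * trizol_ml,   # RNA precipitation
        "ethanol_75_ml": 1.0 * trizol_ml,    # RNA wash
        "use_15ml_tube": trizol_ml > 2.0,    # working volume greater than 2 ml
        "use_20mm_generator": tissue_mg > 600.0,  # tissue weighing more than 0.6 g
    }


# Hypothetical 200 mg sample, for illustration only.
for name, value in trizol_reagent_volumes(200.0).items():
    print(name, value)
```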

[255] Pour off the wash. Carefully transfer pellet to an eppendorf tube (let it
slide down the tube into the new tube and use a pipet tip to help guide it in if necessary).
Depending on the volumes you are working with, you can decide what size tube(s) you want
to precipitate the RNA in. When I tried leaving the RNA in the large 15ml tube, it took so
long to dry (i.e. it did not dry) that I eventually had to transfer it to a smaller tube. Let pellet
dry in hood. Resuspend RNA in an appropriate volume of DEPC H2O. Try for 2-5ug/ul.
Take absorbance readings.
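
The protocol does not state how the absorbance reading is converted to a concentration. A minimal Python sketch is given below under the common laboratory convention, which is an assumption here and not part of this protocol, that an A260 of 1.0 corresponds to roughly 40 ug/ml of RNA. The function name and the example reading are hypothetical.

```python
def rna_ug_per_ul(a260, dilution_factor, ug_per_ml_per_a260=40.0):
    """Estimate RNA concentration (ug/ul) of the undiluted sample from an A260
    reading taken on a diluted aliquot. The 40 ug/ml factor is a convention
    assumed for this sketch, not a value given in the protocol."""
    if a260 < 0 or dilution_factor <= 0:
        raise ValueError("A260 must be non-negative and dilution factor positive")
    return a260 * dilution_factor * ug_per_ml_per_a260 / 1000.0


# Hypothetical reading: A260 of 0.5 on a 1:200 dilution.
concentration = rna_ug_per_ul(0.5, 200.0)
print(concentration, 2.0 <= concentration <= 5.0)
```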

[256] Purify poly A+ mRNA from total RNA or clean up total RNA using
Qiagen's RNeasy kit

[257] Purification of poly A+ mRNA from total RNA. Heat Oligotex
suspension to 37°C and mix immediately before adding to RNA. Incubate Elution Buffer at
70°C. Warm up 2 x Binding Buffer at 65°C if there is precipitate in the buffer. Mix total
RNA with DEPC-treated water, 2 x Binding Buffer, and Oligotex according to Table 2 on
page 16 of the Oligotex Handbook. Incubate for 3 minutes at 65°C. Incubate for 10 minutes
at room temperature.

[258] Centrifuge for 2 minutes at 14,000 to 18,000 g. If centrifuge has a
"soft setting," then use it. Remove supernatant without disturbing Oligotex pellet. A little bit
of solution can be left behind to reduce the loss of Oligotex. Save sup until certain that
satisfactory binding and elution of poly A+ mRNA has occurred.

[259] Gently resuspend in Wash Buffer OW2 and pipet onto spin column.
Centrifuge the spin column at full speed (soft setting if possible) for 1 minute.

[260] Transfer spin column to a new collection tube and gently resuspend in
Wash Buffer OW2 and centrifuge as described herein.

[261] Transfer spin column to a new tube and elute with 20 to 100 ul of
preheated (70°C) Elution Buffer. Gently resuspend Oligotex resin by pipetting up and down.
Centrifuge as above. Repeat elution with fresh elution buffer or use first eluate to keep the
elution volume low.

[262] Read absorbance, using diluted Elution Buffer as the blank. 

[263] Before proceeding with cDNA synthesis, the mRNA must be
precipitated. Some component left over from the Oligotex purification procedure, or in the
Elution Buffer, will inhibit downstream enzymatic reactions of the mRNA.

Ethanol Precipitation
[264] Add 0.4 vol. of 7.5 M NH4OAc + 2.5 vol. of cold 100% ethanol.
Precipitate at -20°C 1 hour to overnight (or 20-30 min. at -70°C). Centrifuge at 14,000-
16,000 x g for 30 minutes at 4°C. Wash pellet with 0.5ml of 80% ethanol (-20°C) then
centrifuge at 14,000-16,000 x g for 5 minutes at room temperature. Repeat 80% ethanol
wash. Dry the last bit of ethanol from the pellet in the hood. (Do not speed vacuum).
Suspend pellet in DEPC H2O at 1ug/ul concentration.

Clean up total RNA using Qiagen's RNeasy kit
[265] Add no more than 100ug to an RNeasy column. Adjust sample to a
volume of 100ul with RNase-free water. Add 350ul Buffer RLT then 250ul ethanol (100%)
to the sample. Mix by pipetting (do not centrifuge) then apply sample to an RNeasy mini
spin column. Centrifuge for 15 sec at >10,000rpm. If concerned about yield, re-apply
flowthrough to column and centrifuge again.
[266] Transfer column to a new 2-ml collection tube. Add 500ul Buffer RPE
and centrifuge for 15 sec at >10,000rpm. Discard flowthrough. Add 500ul Buffer RPE and
centrifuge for 15 sec at >10,000rpm. Discard flowthrough then centrifuge for 2 min at
maximum speed to dry column membrane. Transfer column to a new 1.5-ml collection tube
and apply 30-50ul of RNase-free water directly onto column membrane. Centrifuge 1 min at
>10,000rpm. Repeat elution.

[267] Take absorbance reading. If necessary, ethanol precipitate with
ammonium acetate and 2.5X volume 100% ethanol.

[268] Make cDNA using Gibco's "Superscript Choice System for cDNA
Synthesis" kit

First Strand cDNA Synthesis
[269] Use 5ug of total RNA or 1ug of polyA+ mRNA as starting material.
For total RNA, use 2ul of Superscript RT. For polyA+ mRNA, use 1ul of Superscript RT.
Final volume of first strand synthesis mix is 20ul. RNA must be in a volume no greater than
10ul. Incubate RNA with 1ul of 100pmol T7-T24 oligo for 10 min at 70C. On ice, add 7 ul
of: 4ul 5X 1st Strand Buffer, 2ul of 0.1M DTT, and 1ul of 10mM dNTP mix. Incubate at
37C for 2 min then add Superscript RT.
Incubate at 37C for 1 hour.
Second Strand Synthesis
Place 1st strand reactions on ice.
Add: 91ul DEPC H2O
     30ul 5X 2nd Strand Buffer
     3ul 10mM dNTP mix
     1ul 10U/ul E. coli DNA Ligase
     4ul 10U/ul E. coli DNA Polymerase
     1ul 2U/ul RNase H

[270] Make the above into a mix if there are more than 2 samples. Mix and
incubate 2 hours at 16C.
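
Where a mix is prepared for several samples, the per-reaction volumes listed above can simply be scaled. The Python sketch below is illustrative only: the names are hypothetical, the optional overage parameter is an assumption not found in this protocol, and the 3ul dNTP volume is the value as read from the list above.

```python
# Per-reaction second strand additions (ul), as listed in the protocol.
SECOND_STRAND_UL = {
    "DEPC H2O": 91.0,
    "5X 2nd Strand Buffer": 30.0,
    "10mM dNTP mix": 3.0,
    "E. coli DNA Ligase (10U/ul)": 1.0,
    "E. coli DNA Polymerase (10U/ul)": 4.0,
    "RNase H (2U/ul)": 1.0,
}


def second_strand_master_mix(n_samples, overage=0.0):
    """Scale the per-reaction volumes to n_samples. `overage` is a hypothetical
    fractional excess (e.g. 0.1 for 10%) to allow for pipetting loss."""
    if n_samples < 1 or overage < 0:
        raise ValueError("n_samples must be >= 1 and overage >= 0")
    scale = n_samples * (1.0 + overage)
    return {name: round(volume * scale, 2) for name, volume in SECOND_STRAND_UL.items()}


per_reaction_ul = sum(SECOND_STRAND_UL.values())  # volume added to each 20ul first strand reaction
print(per_reaction_ul)
print(second_strand_master_mix(4))
```

With these values each first strand reaction receives 130ul of mix, for a second strand reaction volume of 150ul.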

[271] Add 2ul T4 DNA Polymerase. Incubate 5 min at 16C. Add 10ul of
0.5M EDTA.

[272] Clean up cDNA

[273] Phenol:Chloroform:Isoamyl Alcohol (25:24:1) purification using
Phase-Lock gel tubes:

[274] Centrifuge PLG tubes for 30 sec at maximum speed. Transfer cDNA
mix to PLG tube. Add equal volume of phenol:chloroform:isoamyl alcohol and shake
vigorously (do not vortex). Centrifuge 5 minutes at maximum speed. Transfer top aqueous
solution to a new tube. Ethanol precipitate: add 7.5X 5M NH4OAc and 2.5X volume of
100% ethanol. Centrifuge immediately at room temp. for 20 min, maximum speed. Remove
sup then wash pellet 2X with cold 80% ethanol. Remove as much ethanol wash as possible
then let pellet air dry. Resuspend pellet in 3ul RNase-free water.

In vitro Transcription (IVT) and labeling with biotin

Pipet 1.5ul of cDNA into a thin-wall PCR tube.

Make NTP labeling mix:

Combine at room temperature:
2ul T7 10xATP (75mM) (Ambion)
2ul T7 10xGTP (75mM) (Ambion)
1.5ul T7 10xCTP (75mM) (Ambion)
1.5ul T7 10xUTP (75mM) (Ambion)
3.75ul 10mM Bio-11-UTP (Boehringer-Mannheim/Roche or Enzo)
3.75ul 10mM Bio-16-CTP (Enzo)
2ul 10x T7 transcription buffer (Ambion)
2ul 10x T7 enzyme mix (Ambion)

[275] Final volume of total reaction is 20ul. Incubate 6 hours at 37C in a
PCR machine.



RNeasy clean-up of IVT product 
[276] Follow previous instructions for RNeasy columns or refer to Qiagen's 
RNeasy protocol handbook. 

[277] cRNA will most likely need to be ethanol precipitated. Resuspend in
a volume compatible with the fragmentation step.

Fragmentation 

[278] 15 ug of labeled RNA is usually fragmented. Try to minimize the
fragmentation reaction volume; a 10 ul volume is recommended but 20 ul is all right. Do not
go higher than 20 ul because the magnesium in the fragmentation buffer contributes to
precipitation in the hybridization buffer.

[279] Fragment RNA by incubation at 94 C for 35 minutes in 1X
Fragmentation buffer.

5X Fragmentation buffer:
200 mM Tris-acetate, pH 8.1
500 mM KOAc
150 mM MgOAc
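
As a worked illustration of diluting the 5X stock to 1X in a fragmentation reaction, the following sketch computes the buffer volume and the resulting 1X component concentrations. The stock concentrations are those given above; the function name and return layout are assumptions made for this example.

```python
# 5X fragmentation buffer composition in mM, as given in the text.
FRAG_BUFFER_5X_MM = {"Tris-acetate": 200.0, "KOAc": 500.0, "MgOAc": 150.0}


def frag_reaction(final_volume_ul, stock_fold=5.0):
    """Return (buffer_ul, rna_plus_water_ul, final_mM) for a 1X reaction.

    final_volume_ul is the total fragmentation volume; the text
    recommends 10 ul and advises against exceeding 20 ul.
    """
    buffer_ul = final_volume_ul / stock_fold
    final_mm = {name: conc / stock_fold for name, conc in FRAG_BUFFER_5X_MM.items()}
    return buffer_ul, final_volume_ul - buffer_ul, final_mm
```

For a 20 ul reaction this gives 4 ul of 5X buffer and 16 ul for RNA plus water, with 1X concentrations of 40 mM Tris-acetate, 100 mM KOAc and 30 mM MgOAc.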






[280] The labeled RNA transcript can be analyzed before and after
fragmentation. Samples can be heated to 65C for 15 minutes and electrophoresed on 1%
agarose/TBE gels to get an approximate idea of the transcript size range.

Hybridization 

[281] 200 ul (10 ug cRNA) of a hybridization mix is put on the chip. If
multiple hybridizations are to be done (such as cycling through a 5 chip set), then it is
recommended that an initial hybridization mix of 300 ul or more be made.

Hybridization Mix: fragmented labeled RNA (50 ng/ul final conc.)
50 pM 948-b control oligo
1.5 pM BioB
5 pM BioC
25 pM BioD
100 pM CRE
0.1 mg/ml herring sperm DNA
0.5 mg/ml acetylated BSA
to 300 ul with 1X MES hyb. buffer
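
The RNA amounts quoted for the hybridization mix follow directly from the 50 ng/ul final concentration. A minimal sketch of that arithmetic is shown here; the function name is hypothetical.

```python
def rna_mass_ug(volume_ul, conc_ng_per_ul=50.0):
    """Mass of fragmented labeled RNA (ug) contained in a given volume of mix."""
    return volume_ul * conc_ng_per_ul / 1000.0


# 300 ul of mix at 50 ng/ul holds the 15 ug that is usually fragmented;
# the 200 ul applied to one chip carries 10 ug.
full_mix_ug = rna_mass_ug(300.0)
per_chip_ug = rna_mass_ug(200.0)
```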

[282] The instruction manuals for the products used herein are incorporated
herein in their entirety.

Labeling Protocol Provided Herein
Hybridization reaction:

Start with non-biotinylated IVT (purified by RNeasy columns)
(see example 1 for steps from tissue to IVT)

IVT antisense RNA; 4 µg: µl
Random Hexamers (1 µg/µl): 4 µl
H2O:
14 µl

- Incubate 70°C, 10 min. Put on ice.

Reverse transcription:

5X First Strand (BRL) buffer: 6 µl
0.1 M DTT: 3 µl
50X dNTP mix: 0.6 µl
H2O: 2.4 µl
Cy3 or Cy5 dUTP (1 mM): 3 µl
SS RT II (BRL):
16 µl

- Add to hybridization reaction.

- Incubate 30 min., 42°C.

- Add 1 µl SS II and let go for another hour.
Put on ice.

- 50X dNTP mix (25 mM of cold dATP, dCTP, and dGTP, 10 mM of dTTP: 25
µl each of 100 mM dATP, dCTP, and dGTP; 10 µl of 100 mM dTTP to 15 µl H2O. dNTPs
from Pharmacia)
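
The stated concentrations of the 50X dNTP mix can be checked with the usual C1V1 = C2V2 relation. The sketch below is illustrative only; the dictionary layout and function name are assumptions, while the volumes and the 100 mM stock concentration come from the recipe above.

```python
# Volumes in µl of each 100 mM stock (and water) in the 50X dNTP mix.
DNTP_MIX_UL = {"dATP": 25.0, "dCTP": 25.0, "dGTP": 25.0, "dTTP": 10.0, "H2O": 15.0}
STOCK_MM = 100.0


def dntp_mix_concentrations():
    """Final concentration (mM) of each nucleotide in the 50X mix."""
    total_ul = sum(DNTP_MIX_UL.values())
    return {
        name: STOCK_MM * vol / total_ul
        for name, vol in DNTP_MIX_UL.items()
        if name != "H2O"
    }
```

The components sum to 100 µl, so the volumes in µl equal the final concentrations in mM: 25 mM each of dATP, dCTP and dGTP, and 10 mM dTTP.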

RNA degradation:

- Add 1.5 µl 1 M NaOH/2 mM EDTA, incubate at 65°C, 10 min.
1 M NaOH/2 mM EDTA: 86 µl H2O, 10 µl 10 N NaOH, 4 µl 50 mM EDTA

U-Con 30:

500 µl TE/sample, spin at 7000g for 10 min, save flow through for purification

Qiagen purification:

- suspend u-con recovered material in 500 µl buffer PB
- proceed w/ normal Qiagen protocol

DNAse digest:

- Add 1 µl of 1/100 dil of DNAse/30 µl Rx and incubate at 37°C for 15 min.
- 5 min 95°C to denature enzyme

Sample preparation:

- Add:

Cot-1 DNA: 10 µl
50X dNTPs: 1 µl
Na pyrophosphate: 7.5 µl
10 mg/ml Herring sperm DNA: 1 ul of 1/10 dilution
21.8 µl final vol.

- Dry down in speed vac.

- Resuspend in 15 µl H2O.

- Add 0.38 µl 10% SDS.
- Heat 95°C, 2 min.
- Slow cool at room temp. for 20 min.

Put on slide and hybridize overnight at 64°C.

Washing after the hybridization:

3X SSC/0.03% SDS: 2 min. 37.5 ml 20X SSC + 0.75 ml 10% SDS in 250 ml H2O

1X SSC: 5 min. 12.5 ml 20X SSC in 250 ml H2O

0.2X SSC: 5 min. 2.5 ml 20X SSC in 250 ml H2O

Dry slides in centrifuge, 1000 RPM, 1 min.
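
The wash buffer recipes above are plain stock dilutions, and the stated strengths can be verified with a short calculation. This sketch assumes that "in 250 ml H2O" means a final volume of 250 ml; the function name is hypothetical.

```python
import math


def diluted_strength(stock_strength, stock_ml, final_ml=250.0):
    """Strength after diluting stock_ml of stock to final_ml.

    Works for any unit of strength (fold, as in 20X SSC, or percent,
    as in 10% SDS). Assumes final_ml is the total volume.
    """
    return stock_strength * stock_ml / final_ml


wash_1_ssc = diluted_strength(20.0, 37.5)   # 3X SSC
wash_1_sds = diluted_strength(10.0, 0.75)   # 0.03% SDS
wash_2_ssc = diluted_strength(20.0, 12.5)   # 1X SSC
wash_3_ssc = diluted_strength(20.0, 2.5)    # 0.2X SSC
```

Under that assumption the three recipes reproduce the 3X, 1X and 0.2X SSC strengths named in the text, and the SDS comes out at 0.03%.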

[283] Scan using appropriate Photomultiplier tube (PMT) and fluorescent 
excitation and emission channels. 

[284] The results are shown in Table 1 and Table 2. The lists of genes come
from colorectal tumors from a variety of stages of the disease. The genes that are
upregulated in the tumors (overall) were also found to be expressed at a limited amount or not at
all in the body map. The body map consists of at least 28 tissue types, including Adrenal
Gland, Bladder, Bone Marrow, Brain, Breast, Cervix, Colon, Diaphragm, Heart, Kidney,
Liver, Lung, Lymph Node, Muscle, Pancreas, Prostate, Rectum, Salivary Gland, Skin, Small
Intestine, Spinal Cord, Spleen, Stomach, Testis, Thymus, Thyroid, Trachea and Uterus. As
indicated, some of the Accession numbers include expression sequence tags (ESTs). Thus, in
one embodiment herein, genes within an expression profile, also termed expression profile
genes, include ESTs and are not necessarily full length.

[285] Table 1 shows Accession numbers for 1747 genes upregulated in colon
tumor tissue. The table provides the exemplar accession numbers, Unigene ID numbers,
unique Eos codes, descriptions of the genes encoded, and relative amount of expression as
compared with expression in other normal body tissue.

TABLE 1. GENES INVOLVED IN COLORECTAL CANCER

PKey Primekey (unique probeset identifier)
Ex. Accn. Exemplar accession number
Probeset Eos Code number
Unigene# Unigene number

Pkey Probeset Ex Accn UniG ID UniGene Title Ratio TumMet/Body

332264 EOS32195 N72849 Hs.115263 epiregulin 17.6

332716 EOS32647 L00058 Hs.79070 v-myc avian myelocytomatosis viral oncogene homolog 15.0

312845 EOS12776 AI911215 Hs.18S555 ESTs 14.3 

310257 EOS10188 AW389247 Hs.148826 ESTs 11.6

322587 EOS224g8 AF155108 EST cluster (not in UniGene) 11.5

331060 EOS30991 N75081 Hs.21648 ESTs 10.3

322303 EOS22234 W07459 EST cluster (not in UniGene) 9.6

301891 EOS01822 AF131855 Hs.108127 Homo sapiens clone 25056 mRNA sequence 9.5

318524 EOS18455 AW2915t1 Hs.253687 ESTs 8.9

314001 EOS13932 AW168495 Hs.8750 ESTs 7.8

331183 EOS31114 T40769 Hs.8469 EST 7.3 

315429 EOS15360 AW009951 i-^892 ESTs 7.3 

303344 EOS03275 AA255977 Hs.250646 ESTs; Highly similar to ubiquitin-conjugating enzyme [M.musculus] 6.7

313625 EOS13556 AW468402 Hs.254020 ESTs 6.7

307084 EOS07015 AI160527 EST singleton (not in UniGene) with exon hit 6.1

314943 EOS14874 AI476797 Hs.184572 cell division cycle 2; G1 to S and G2 to M 6.1

303753 EOS03684 AW503733 Hs.170315 ESTs 5.7 

315593 EOS15524 AW198103 Hs.158154 ESTs 5.3 

313604 EOS13635 AI745325 Hs.182286 ESTs; Moderately similar to !!!! ALU SUBFAMILY SB2 WARNING ENTRY !!!! [H.sapiens] 5.1

312319 EOS12250 AA216698 Hs.180780 Homo sapiens agrin precursor mRNA; partial cds 5.1

312614 EOS12545 AI766732 Hs.201ig4 ESTs 4.8 

323176 EOS23107 AW071648 Hs.123199 ESTs 4.8 

317916 EOS17847 AI565071 Hs.159983 ESTs 4.7 

301846 EOS01777 R20002 Hs.6823 ESTs; Weakly similar to intrinsic factor-B12 receptor precursor [H.sapiens] 4.6

311157 EOS11088 AI990122 Hs.196988 ESTs 4.6

332640 EOS32571 AA417152 Hs.5101 protein regulator of cytokinesis 1 4.6

311728 EOS11659 AW083000 Hs.184776 ribosomal protein L23a 4.5

313774 EOS13705 AW136836 Hs.144583 ESTs 4.5

312339 EOS12270 AA524394 EST cluster (not in UniGene) 4.4

315369 EOS15300 AA764918 Hs.256531 ESTs 4.3

303756 EOS03687 AI738488 Hs.115838 ESTs 4.3

301050 EOS00981 AW136973 Hs.144475 ESTs; Weakly similar to mitogen inducible gene mig-2 [H.sapiens] 4.3

300319 EOS00250 AW157646 Hs.153506 ESTs; Weakly similar to microtubule-actin crosslinking 4.3

300664 EOS00595 AI444828 Hs.256809 ESTs 4.3

302655 EOS02586 AJ227892 EST cluster (not in UniGene) with exon hit 4.1 

315175 EOS15106 AI025842 Hs.152530 ESTs 4.1 

330786 EOS30717 D60374 Hs.258712 EST 4.1 

310875 EOS10808 T47764 Hs.132917 ESTs 4.1 

313425 EOS13356 AA745689 Hs.186838 ESTs; Weakly similar to similar to zinc finger 5 protein from Gallus gallus; U51640 [H.sapiens] 4.0

301804 EOS01735 AA581004 EST cluster (not in UniGene) with exon hit 4.0

332203 EOS32134 H49388 Hs.102082 EST 3.9

322968 EOS22899 AI905228 EST cluster (not in UniGene) 3.8

321524 EOS21455 N79126 EST cluster (not in UniGene) 3.8

302476 EOS02407 AF182294 EST cluster (not in UniGene) with exon hit 3.8

303295 EOS03226 AA205625 Hs.208067 ESTs 3.8

310016 EOS09947 AW449512 Hs.152475 ESTs 3.7

324871 EOS24802 AW297755 Hs.148832 ESTs 3.7

322887 EOS22818 AI986305 Hs.233460 ESTs; Weakly similar to KIAA0969 protein [H.sapiens] 3.7

313171 EOS13102 N67879 Hs.157695 ESTs 3.7 

321638 EOS21569 AI356352 Hs.108932 ESTs 3.7 

320445 EOS20378 R33916 EST cluster (not in UniGene) 3.6

302149 EOS02080 AI383794 Hs.152337 protein arginine N-methyltransferase 3 (hnRNP methyltransferase S. cerevisiae) 3.6

316905 EOS16836 AW138241 Hs.210846 ESTs 3.6

313166 EOS13097 AI801098 Hs.151500 ESTs 3.6

323338 EOS23269 R74219 Hs.23348 S-phase kinase-associated protein 2 (p45) 3.5

311434 EOS11366 AW016607 Hs.201582 ESTs 3.5 

312742 EOS12673 AI650363 Hs.116462 ESTs 3.4 

323587 EOS23516 AI905527 Hs.141901 ESTs; Moderately similar to !!!! ALU SUBFAMILY SP WARNING ENTRY !!!! [H.sapiens] 3.4

317390 EOS17321 AW136551 Hs.161245 ESTs 3.4

315282 EOS15213 AI222165 Hs.144923 ESTs 3.4

318565 EOS18496 AI440137 Hs.164989 ESTs 3.4

307586 EOS07517 AI285499 EST singleton (not in UniGene) with exon hit 3.4

321052 EOS20983 AW372884 Hs.240770 nuclear cap binding protein subunit 2; 20kD 3.3

324338 EOS24269 AL138357 Hs.247514 ESTs 3.3 

307517 EOS07448 AI275055 Hs.164989 ESTs 3.3

314852 EOS14763 AI903735 Hs.137527 ESTs; Weakly similar to X-linked retinopathy protein [H.sapiens] 3.3

324657 EOS24568 AW451142 Hs.255628 ESTs 3.2

314912 EOS14843 AI431345 Hs.161784 ESTs 3.2

324790 EOS24721 AI334367 Hs.159337 ESTs 3.2

315498 EOS15429 AA628539 Hs.116252 ESTs; Moderately similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 3.2

312857 EOS12788 AA772279 Hs.126914 ESTs 3.2

300762 EOS00693 AI497778 Hs.168053 ESTs 3.2
325587 EOS25518 clZJis giI66824^gn1 1-126724 126987 ex 7 7 CO$l Z44 244 3099 

Cai^gq6682462 3^ 

320654 EOS20585 AW283086 Ks.118112 ESTs 3^ 

S 316715 EOS16648 AI440266 KS.17Q873 ESTs 3.1 
333279 EOS33210 CH22_522FGJ26J^UNK_EMAC005500.GENSCAN^1 

CH2?.FGENES.126J 3.1 

309689 EOS09^0 AW236171 Hs.181357 lamlnin receptor 1 (67kP; nl)osoma) prolein SA) 3.1 

32^8 EOS23777 AA337621 Hs.137635 ESTs 3.1 

10 324876 EOS24609 AI990739 Hs.236511 ESTs; Moderately siniBar to RNAspik^g^ated protein (Rrm 11 

. 308362 EOS08293 AI613519 EST singleton (not In UnlGens) with exon hit 3.1 

308615 EOS08546 AI738593 EST singieton (not bUniGene) with exon hit 3.0 

315397 EOS15328 AA218940 Ks.137516 ESTs 3.0 

302238 EOS02167 AI128608 Ks.167558 ztnc finger protein 161 3.0 

15 321693 EOS21624 AA700017 Hs.173737 ras^latedC3boyinumto;dnsut)stratBl (rhobn)IIy;smaDGTPbhding^ 3.0 

330814 EOS30745 AA015730 Hs.247277 ESTs; Weakty similar to transfonnation^elaled protein 3.0 

302977 EOS02908 AW283124 ESTchister(not!nUniGena)with6xonhil 3.0 
327516 EOS27447 cjljis gi|6117815|req gn 6 199078 199216 ex4 4 CDSI 9.15 139 1551 

CH.0i.hsglI6117815 2.9 

20 333278 EOS33209 CH2?J521FGJ25^UNK.EMAC005500.GENSCAN.7.2 

CH2?J=GENES.125_2 2.9 

302088 EOS02019 077629 Hs.135639 achaete-scute complex (Drosophila) homology 2 2.9 

322718 EOS22&49 AF150270 H5.233322 ESTs; Weakly similar to cONA EST EMBljT01156cornesta this gene M 2.9 
329154 EO$29085 cjOis 9l5868686|rengn 2 -200851 201356 ex 1 3 COSI 30.28 506 1812 

25 CHXJisgi|5868686 2-9 

315978 EOS15909 AA830893 Hs.1 19769 ESTs 2.9 

302677 EOS02608 H63227 Hs.132880 ESTs; Highly similar to ubi(tuifinK»n|ugaSngen^ 2.9 

315007 EOS14938 A1808583 Hs,125291 ESTs 2.9 

303780 EOS03711 AI424014 Hs.243450 ESTs; Moderately similar to KIAA0456 protein [Hxgpens) 2.9 

30 331362 EOS31293 AA417956 Hs,40782 ESTs 2.9 
335815 EOS35748 CH2^.3187F6_618J_UNK.EM:AC005500.GENSCAN.51(W 

CH2^FGENES.618.3 2.8 

332070 EOS32001 AA598545 Hs.228138 EST 2.8 

315720 EOS15651 AW291875 Hs.163900 ESTs 2.8 

35 311913 EOS11844 A1358522 Hs.221417 ESTs ^8 

331014 EOS30945 H98597 Hs.3034O ESTs 2.8 

322035 EOS21966 AL137517 EST cluster (not in UnlGene) ZB 
338057 EOS37988 CH2a.6558F6_UN)eEM:AC005500.GENSCAN.160.1 

CH22_EM:AC005500.GENSCAN.160.1 2.8 

40 335829 EOS3S760 CH22.3202F6_620J_UNK^EM:AC005500.G£NSCAN.512-3 

CH22.FGENES.620.3 2.8 

312136 EOS12067 AW451469 Ks.209990 ESTs 2.8 

303132 EOS03083 A)929819 Hs.193330 ESTs 2.8 

317548 EOS17479 AI654187 Hs.195704 ESTs 28 

45 325585 EOS25516 c1ZJisgiI66824621ref|gn 1 ••-73476 73574 ex 5 7 CDS! 8.5299309 

7 CH.12_hsgiI6682462 27 
334631 EOS34562 CH22^1939F6^416_7.UNK^EM:AC005500.GENSCAN.277-7 

CH22_FGENES.416.7 2.7 
329156 EOS29087 cjehs g{i5868686Irefign 2 - 202013 202341 ex 3 3 CDSf 10.23 3291814 

50 CR)ehsgi|586B686 2.7 

318815 EOS18546 AI133617 Hs.191088 ESTs 2.7 

300734 EOS00665 AW205197 Hs.240951 ESTs 2.7 

324430 EOS24361 AA464018 EST cluster (not In UniGene) 2.7 

322296 EOS22227 W76326 Hs,251937 ESTs 2.7 

55 303842 EOS03773 AI337304 Hs.126268 ESTs; Weakly similar to similar to PDZ domain [Celegans] 27 

320909 EOS20840 D82269 EST cluster (not in UniGene) 27 

325195 EOS25126 T20258 Hs.171443 ESTs; Weakly similar to acltn tending protein MAYVEN(H.sapiens] 27 

324959 EOS24890 AW367745 Hs.143137 ESTs 27 

309997 EOS09928 AI291621 Hs.l45199 ESTs 27 

60 329367 EOS29298 cjOts gil5868842|req gn 1 - 87201 87587 ex 1 4 COS! 8.13 387 3908 

CHXJjsgi|5868842 27 

316697 EOS16628 AW293174 Hs.252627 ESTs 27 

313600 EOS13531 AA429564 Hs.185802 ESTs 27 

301471 EOS01402 AA995014 Hs.l29544 ESTs; WeaWy similar to ORFYLl027to[Sxerevislael 26 

65 300810 EOS00741 AI076890 Hs.186949 ESTs 26 

319976 EOS19907 N48809 Hs.250824 ESTs 26 

313434 EOS13365 W92070 Hs.231902 ESTs 26 
333849 EOS337B0 CH2a»1l18FG_2903.UNK_EM:AC005500.6ENSCAN. 146-7 

CH22^FGENES.290J 26 

70 330744 EOS30675 AA406142 Hs.l2393 dTDP>D^Iucose 4;6Klahydratase 26 

309398 EOS09329 AW081820 EST singleton (not in UniGene) with exon hit 26 
338727 EOS38658 O122.7523F6_UNK_EMAC005500.GENSCAN.500-2 

CH22_EM:AC005500.GENSCAR500-2 26 

„^ 324620 EOS24551 AA448021 EST duster (not in UniGene) 26 

75 335755 EOS3568S CH22J122FGJ04_4^UNK^EM:AC005500.GENSCAN.493-9 

CH2^.FGE^IES.604J 26 

315858 EOS15789 AA737345 EST cluster (not in UniGene) 26 

307288 EOS07219 AI205169 EST singleton (not in UniGene) wfthexonhJI 25 

330542 EOS30473 U23942 Hs^13 cytochronnsP450;51 (lanosterol14^ph»lemethytase) 25 

80 335896 EOS35827 Oi22L3273FG_635.4_UNICEM:AC005500.GENSCAN.525^ 

CH2^ENES,635.4 25 

316578 EOS16509 AA775623 Hs.211883 ESTs 25 
329193 EOS29124 cjUiS gi|5868716trBQgn 3 1.168095 168181 ex 99 COSI -1.1 187 2084 

CHJU»giI5^16 25 

85 315193 EOS15124 AI241331 Hs.131765 ESTs 25 

319478 EOS19409 R06841 EST cluster (not in UniGene) 2.5

334727 EOS34658 CH22^2038FGJ24J_UNK.EM:AC00550aGENSCAN.28M

328113 EOS28044 cJJisgi|^68024|reqgn2.8037880491ex23COSi 3i891143247 

Qi06Jisg(|58S8Q24 2.5 

315214 EOS15145 AI915927 Hs.34771 ESTs ^5 

324718 EOS24649 AI557019 Hs.116467 ESTs 2.5 

313326 EOS13257 AI088120 Hs.122329 ESTs 2.5 

319480 EOS19411 R06933 Hs,184221 ESTs ^5 

317902 EOS17833 AI828602 Hs.211265 ESTs ^5 

323341 EOS23272 AL134875 Hs.192386 ESTs ^S 

336003 EOS35934 ai2L3385FG.664.4_UNiaXI321ia6ENSCAN.S4 

CH2?JGENES.664_4 ZS 

322992 EOS22923 AA142B91 Hs.193165 ESTs 2.5 

314911 EOS14842 AW292329 ^.163481 ESTs 2.5 

313603 EOS13534 AW468119 ESTctuster (notinUniGene) 2.5 

308469 EOS06400 AA983792 ESTsingteton(ROttnUniGen8)wflhexonhtt Z5 

324715 EOS24646 AI739168 ESTctust8r(R0tinUni6en8) 2.5 

302455 EOS02386 AA356923 Hs.240770 nuclear C£|pUn(DngpratBlnsiibuii» 2; 201(0 Z4 

321023 E0Sm4 H25135 Hs.125608 ESTs 2.4 

302099 EO^)2030 ALJ021397 Hs.137576 ribosomd protein L34ps6Udogene1 2.4 

314092 EOS14023 Ai984040 H5^26946 ESTs 2.4 

318587 EOS18516 AA779704 Hs.168830 ESTs ^4 

303702 EOS03633 AW500748 Hs.224961 ESTs; Wea)dysiniarto73kDAsubunttofdeava9eandpoIyadefQrl^^ Z4 

301622 EOS01753 X17033 Ks.1142 {nlegrin;fidplia2(CD49B;8)pha28ubunHofVlA-2r8ce^ 2.4 

322694 EOS22^5 AI110872 EST cluster (not In Uf^Gene) Z4 

323333 EOS23264 AA228883 EST cluster (notinUniGene) 2.4 

301954 EOS018d5 AJ009938 Hs.1 18138 nuclear receptorsutifamilyl; group I; member 2 « 2.4 

331383 EOS31^4 M421562 H5.91011 antedor gradient 2 (Xenepuslaevl5)homolog 24 

303811 EOS03742 AW182340 Ks.246155 ESTs; Weakly similar to DNATOPOISOMERASE I [H.saplens] U 

308243 EOS08174 A1560037 EST singleton (not in UniGene)wthexonh'it 2.4 

338021 EOS35952 CH22.3404FG.669.10_UNKJ)J32110.G£NSCAN.9-15 

CH22^FG£NES.669_10 2.4 

334789 EOS34720 CH22JZ10iroj3^14„UNICEMAC00550aGENSCANmi7 

CH2?_FGENES.43a.14 24 

320807 EOS20738 AA086110 Hs.188536 Homo sapiens done 24838 mRNA sequence 24 

328903 EOS28834 cJJisgi|S868514|reqgn U 23625 24468 ex 3 5 COS! 91.18 844 219 

CH.08_hsgi|5868514 24 

338769 EOS38690 CH22.7581FG_UNK.EMAC005500.GENSCAN.517-6 

CH22.EfAACC05500.GENSCAN.517-6 23 

333769 EOS33700 CH22_1036FGJ71_8_UNK_EM:AC005500.GENSCAN.127-8 

CH22.FGENES.27L8 23 

303597 EOS03528 A1792141 Hs.143560 ESTs; Weakly similar to brain mitochondrial canter prote!n-1 23 

305898 EOS05829 AA872838 Hs.242463 keraflnS 23 

304439 EOS04370 AA398882 EST singleton (not in UnlGene)witliexDn tut 23 

301604 EOS01535 AA373124 Hs.105837 ESTs; WeaWy similar to C17G10.1 [Celegans) 23 

315071 EOS15002 AA552690 H3.152423 ESTs 23 

330565 EOS30496 U51095 H5.1545 caudal type bomeo box transcr^iilbn factor 1 23 

331589 EOS31520 N71027 Hs.41856 ESTs 23 

303216 EOS03147 AA581439 H8.152328 ESTs 23 

324988 EOS24919 T06997 EST clusler (notinUniGene) 23 

312996 EOS12927 AA249018 EST duster (notinUniGene) 23 

332314 EOS32245 T25662 Hs.101774 ESTs 23 

313325 EOS13256 A!420611 Hs.127832 ESTs 23 

322991 EOS22922 C18965 Hs.159473 ESTs 23 

335498 EOS3S427 CH2?J848FG_571_4JJNK_EMAC00550aGENSCAN.460^25 

CH22LFGENES.571.4 23 

315135 EOS15066 AA627551 Hs.192446 ESTs 23 

319488 EOS19419 AW250340 EST duster (not in UniGene) 2.3 

323571 EOS23502 AAg84133 H5.153260 c<%)-interacting protein 23 

322826 EOS22757 AI607863 Hs.156932 ESTs 23 

322221 EOS22152 AI890619 Hs.179662 nucleosome assembly protein 1-like 1 23 

312242 EOS12173 AI380207 HS.12S276 ESTs 23 

315238 EOS15169 AAS93867 Hs.170890 ESTs 23 

315168 EOS15099 AA622130 H8.152524 ESTs 23 

300504 EOS00435 AW204624 H8.192927 ESTs; Weakly simBar to Urn kinase |H.sap{ensl 23 

323243 EOS23174 W44372 EST duster (not In UniGene) 23 

331628 EOS31559 R80965 Hs.2O4079 ESTs 23 

320746 EOS20677 AA128302 EST cluster (not in UniGene) 23 

324598 . EOS24529 AA502659 Hs.163986 ESTs 23 

308867 EOS08598 A1758754 EST singleton (not in UniGene) with exon hit 22 

302944 EOSQ2875 AA340708 HS.2S6204 ESTs; Weakly siniilar to cydicnudeotide^aledchannd beta subunlt[R.norvegicus] 22 

316291 EOS16222 AW375974 Hs.156704 ESTs 22 

31529S EOS15227 AA876905 H8.125286 ESTs 22 

334150 EOS34081 C»l2L1429FG_339_LUNICEM:AC00550aGENSCAN.189.1 

CH22/GENES.339J 22 

331380 EOS31311 AA453266 Hs.246131 ESTs 22 

321795 EOS21726 AI796896 Hs^446 ESTs 22 

331493 EOS31424 N34357 Hs.44571 ESTs 22 

312890 EOS12821 AI813654 Hs.127478 ESTs 22 

315563 EOS15514 AW003622 Hs.126555 ESTs 22 

314306 EOS14237 AI697901 Hs.192425 ESTs 22 . 

314138 EOS14069 AA740616 EST duster (nol In UniGene) 22 

302656 EOS02587 AW293005 Hs.220905 ESTs 22 

313564 EOS13495 AA810141 ^.192182 ESTs 22 

332792 EOS32723 CH2?_8FGJJUJNK-C4G1.GENSCAN.a.2 

CH2^.FGENES.3_? 2.2

332020 EOS31951 AA488895 Hs.105219 ESTs 2.2

315143 EO815074 AA878324 Hs.1d2734 ESTs 2-2 

313385 EOS13316 AW320a7 Hs.176711 ESTs 2^ 

323835 EOS23766 ALO420O5 EST duster (not ^UnfGene) 2:2 

314014 EOS13945 AW291847 Hs.121715 ESTs: W^aMy similar to HP protein [Ksaplens] 2.2 

336016 EOS35947 CH2a.3399FG^669_5LUNKJ3J32IiaGENSCAN.9-10 

CH2a„FGENES.669.5 2.2 

323218 EOS23149 AF131646 KS.133S6 Homo sa{to done 25028 mRNA sequence ZZ 

338059 £0337990 CH22.6561FGL.UNK.EAAAC005500.GENSCAR1604 

CH22_EM:AC005500.GENSCAN.1604 12 

302613 EOS02544 AA371(^ Hs.251638 ubiquiGn spedfic pn)lease 3 2.2 

304852 EOS04783 AA588595 EST singleton (not in UntGene) with exon hit 2.2 

308457 EOSOSSSa At669859 EST skigtetoft (not taUnlGene) with axon hit 22 

311736 EOS11667 M765897 EST duster (not in UniGene) 2.2 

334183 EOS34114 CH22_1464FG.350J3.UNK_EMAC005500.GE1^CAN.209-16 

CH22J(^Ea350J3 2.2 

315021 EOS14952 AAS33447 EST duster (not in UniGena) 12 

303013 EOS02944 F07898 Hs.214190 intefteuicin enhancer binding factor 1 2.2 

315006 EOS14937 AI538613 Hs.135657 ESTs 2.2 

337534 EOS37465 a^2^,5B03FG.828J_ CH2aj=GENES.82S^ 2.2 

303276 EOS03207 AA431599 Hs.132799 ESTs Z1 

318617 EOS18548 AW247252 Hs.75514 nudeoside phosphorylase l^ 

330760 EOS30691 AA448883 Hs.30469 ESTs 2.1 

319545 COS19476 R83716 Ks.14355 ESTs 2.1 

312252 EOS12183 A1128388 Hs.143655 ESTs 21 

322882 EOS22813 AW248508 Hs.2491 DiGeorgs syndrome critical region gene 2 21 

312684 EOS12615 AW294020 Hs.1 17721 ESTs 21 

315782 EOS15713 AW515455 Hs.1 15558 ESTs; Weakly similar to 111! ALU SUBFAMILY J WAR^flNG EmKY llll (asapie^^ 21 

320076 EOS20007 AI653733 Hs.204079 ESTs 21 

300566 EOS00497 H85709 Hs.21371 sonofseven)es5{Drosophiia)homdog 1 21 

300908 EOS00839 AA616335 H5.146137 ESTs; Weakly simOar to putative [&eiegans] 21 

314778 EOS14709 AWa79559 Hs.152258 ESTs 21 

319233 EOS19164 R21054 Hs.211522 ESTs 21 

335488 EOS35419 CH22J840FG„570_20_UNK.EMAC005500.GENSGAN.460-15 

CH2?_FGENES.570^0 21 

334816 EOS34547 CH22J923FGjH1J5_UNKBflAC005500.GENSCAN.274-22 

CH2ajGENES.41L15 21 

306792 EOS06723 AI042426 EST singleton (not In UniGene) with exon hit 21 

301661 EOS01592 Ai815558 EST duster (not in UniGene) with exon hit 21 

311332 EOS11263 AW292247 Hs.255052 ESTs 21 

314785 EOS14716 A1538226 Hs.135184 ESTs 21 

301460 EOS01391 AW196758 Hs.165998 DKFZP564M2423 protein 21 

332015 EOS31946 AA487910 Hs.2O8800 ESTs; Weakly sinillar to !l!l ALU CLASS B WARNING ENTRY l!ll[H.sapiens] 21 

321529 EOS21460 AI269506 K5.146066 ESTs 21 

323740 EOS23871 AA324643 Hs.246108 ESTs 21 

336019 EOS35950 CH22_3402FG_669_8_LlNieDJ32110.6ENSCAN.9-13 

CH2^GENES.669J 21 

314954 EOS14885 AA521381 Hs.l87726 ESTs 21 

303037 EOS02968 AF118395 EST duster (not in UniGene) with exon hit 21 

302056 EOS01987 AI457532 Hs.126082 ESTs; Moderately similar to ROSA26AS [M.musculusl 21 

315178 EOS15109 AW362945 Hs,162459 ESTs 21 

332246 EOS32177 N57927 Hs.120777 ESTs; Weakly similar to RNA POLYMERASE II ELONGATION FACTOR ELL2[H.sapiens) 20 

334288 EOS34219 CH22_1S77FG.369J8JJNK.EMAC005500.GENSCAN.229-18 

CH22JGENES^9J8 20 

324690 EOS24621 N68286 Hs.132808 ESTs; WeaMy Similar to Similar to apombe-rad4«A»(5nu^ 20 

305257 EOS05188 AA679005 EST singteton (not In UniGene) with exon hit 20 

311315 EOS11246 AW450536 H&2O9260 ESTs 20 

311988 EOS11919 AW016096 Hs.13801 ESTs - 20 

302638 EOS02569 AA463798 Hs.10269S ESTs; We^ similar to CI lD24[C.e!egans] 20 

320531 EOS20462 W03691 Hs.248d4 ESTs; Moderately similar to RNApdyinerasel assodated factor [Mjnuscdus] 20 

323604 EOS23535 AI751438 Hs.182827 ESTs; Weakly similar to 1111 ALU SUBFAMILY SQ WARNING ENTRY llll(H.sap!en$] 20 

308852 EOS08763 AI829848 Hs.182937 p8ptklylprolylisomeraseA(cydbphllinA) 20 

320521 EOS20452 N31464 Hs.24743 ESTs 20 

331306 EOS31237 AA252079 Hs.63931 dadishund{DrDSOphfla)homolog 20 

314941 EOS14872 AA515902 Hs.130650 ESTs 20 

336684 EOS36615 CH22^4167FG 46_1 CH22 FGENES.4S.1 20 

301137 EOS01058 AF049569 H5.137095 ESTs 20 

338454 EOS38385 CH22.7128FG_UNKJEM:AC00550a6ENSCAN.3604 

CH22.EM:AC005500.GENSCAN.38(M 20 

309700 EOS09631 AW241170 Hs.179661 Homo sapiens done 24703 beta^bulinmRNA; complete cds 20 

330262 EOS30193 cJ5j)2gq8871884|gbiAgnU67913680536x33CDSl &41 141 597 

CH.05^2gl|6671884 20 

324163 EOS24094 AL046827 Hs.134651 ESTs 20 

316493 EOS16424 AA766142 Hs.131810 ESTs; Weakly slmSar to !U1 ALU SUBFAMILY J WARNING ENTRY llll [Ksaplens] 20 

311673 EOS11804 AA730045 Hs.187866 ESTs 20 

328757 EOS26688 c20Jis gi|6249610|fef|gn 3 -^74531 74597 ex 1 3 CDSf 9.52671416 

CH.20Jisgi|8249610 20 

319167 EOS19098 F05984 Hs250138 protein phosphatese 2C; magneslunKlependenI; catadytic subunit 20 

316011 EOS15942 AW616953 Hs.201372 ESTs 20 

313635 EOS13566 AA507227 Hs.6390 ESTs 20 

310027 EOS09958 AW449009 Hs.126647 ESTs 20 

336662 EOS36593 CH22.4138F(L41J_ CK22JGENES.41.1 20 

334648 EOS34579 CH22.1956FG_417J5.UNK-EMACQ0550aGENSCANJ278-15 

CH22JGEME&417J5 20 

308676 EOS08607 Ai761036 EST sbtgleton (not in UniGene) wllh exon hH 20 

312047 EOS11978 AA588275 Hs.14258 ESTs 2.0

324826 EOS24^ AA704806 Hs.143842 ESTs 2.0

322889 EOS22820 M081924 Ha211417 ESTs 2.0 

316345 EOS16276 AW139408 HS.1S2940 ESTs 2.0 

. 313922 EOS138S3 AI702038 KS.1G0057 ESTs 2.0 

5 319423 £0819354 TB3024 Hs.15119 ESTs 2.0 

320244 EOS^irS AA296922 Hs.129778 gastrdntssOna) pepfids 2.0 

308957 EOS08888 AI669842 ESTs!R9lebn(notinUnIG8ne)wilhexonhit 2.0 

334223 EOS34154 CH22,iar7FGJ60J_UNieEM:AC00550aGENSCANJ2184 

^ ^ CH22J=GENES.360_4 1.9 

10 302980 EOSQ2911 W93435 EST duster (not In UniGene) with exon hit 1.9 

312153 EOS12084 AA7K250 Hs.153028 cytochraro 1^561 1.9 

326460 EOS28391 c19_hs gi|5867400|rel|gn 3 -142833 142935 ex 1 2 COSl 19.03 3031731 

Cai9jhsgi|5867400 1.9 

319962 EOS19893 H06350 Hs.135056 ESTs 1.9 

IS 307064 EOS06995 A1149335 EST singleton (not in UniGens) with exon hit 1.9 

331608 EOS31539 N89861 Hs.44162 ESTs; Weakly a'milar to cDf^ EST yk342h12.5 comes finxn this gene [Qel^^ 1.9 

328142 EOS28073 c_6Jisgi|5868050|reflgn 1 -9856 9778 ex 26 C0SI1 1.11 123 3339 

CH.06Jsgi|58680S0 1.9 

312527 EOS12458 A1695522 Hs.191271 ESTs 1.9 

20 318581 EOS18512 AA7690a EST duster (not In UnKSene) 1.9 

319979 EOS19910 AB018281 Hs.107479 K1AA0738 gene product 1.9 

338107 EOS38038 CH2^3496F6_696XUNKJ)A59H18.GENSCAN.4-3 

CH2a.FGENES.696^3 1.9 

30S232 EOS05163 AA670052 Hs.195188 giyceraldehyde^phosphate dehydrogenase 1.9 

25 315043 EOS14974 AA806538 Hs.130732 ESTs 1.9 

323377 EOS23308 AA133260 Ks.8454 protein kinase; cAI\ilP-dependent- regulatory; type II; alpha 1.9 

338260 EOS38191 CH22.6863FG_UNieEM:ACO0S50aGENSCA(i279-1O 

CH22.EMAC005500.GENSCAN.279-10 1.9 

334891 EOS34822 CH22,2208FG.45?_5.UN}CEM:AC005500.GENSCAN.341-8 

30 CH22.FGENES.452.5 1.9 

316055 EOS15986 AA693880 EST duster (not in UniQene) 1.9 

312414 EOS12345 AI915014 Hs.164235 ESTs; Weakly simBar to Ilii ALU SUBFAMILY J WAITING ENTRY ill! [H.sapi6ns] 1.9 

300225 EOS00156 AI989g63 Ks.197505 ESTs 1.9 

332607 EOS32538 R41791 Ks.36566 LIM domain kinase 1 1.9 

35 312405 EOS12336 AI523875 EST duster (not In UniGene) 1.9 

313605 EOS13536 AI761786 Hs.204674 ESTs 1.9 

337755 EOS37686 CH2^.6105FG_L!NK_EM:AC000097.GENSCAN. 109-2 

CH22-EM:AC000097.6ENSCAN.109-2 1.9 

323216 EOS23147 AA332145 EST duster (not In UniGene) 19 

40 334872 EOS34803 CH22J2186FG.450XUNK^EM:AC005500.GENSCAN.339-2 

CH22 FGENES.450_2 1.9 

332034 EOS31965 AA489847 Hs.112019 ESTs; Moderately simOar to 111) ALU SUBFAMILY J WARNING ENTRY 111! [H.sapiens] 1.9 

332103 EOS32034 AA609161 Hs.112657 ESTs; Weakly similar to GRFYOR243G[S.cerevlsiae] 1.9 

318198 EOS16127 At056776 Ks.133397 ESTs 19 
45 329141 £0829072 cjOs giI80170801reflgnU 343924 343997 ex 2 3 CDSi ^53 74 1715 

CHX.h5gil6017060 1.9 

321539 EOS21470 N96619 Hs.62461 ARP2 (adin^elated protein 2; yeasQ homdog 1.9 

313881 EOS13812 AA535580 Hs.16331 ESTs 1.9 

314046 EOS13977 AW021917 Hs.181878 ESTs 1.9 
50 336045 EOS35976 CH2^.3430FG_679J_UNK.DJ32i10.GENSCAN.18^ 

C^2^FGENES.679_7 1.9 

324799 EOS24730 AW272262 Hs.250468 ESTs 1.9 

312656 EOS12587 AW152449 Ks.226469 ESTs 1.9 

324662 EOS24593 AW504689 EST duster (not In UniGene) 1.9 

55 323930 EOS23861 AA570698 Hs.193203 ESTs 1-9 

314465 EOS14396 AA602917 Hs.156974 ESTs 1.9 

335897 EOS35828 CH22_3274FG_635.5^UNK_EM:AC005500.GENSCAN.525.7 

CH22.F(XNES.635_5 1.9 

321746 EOS21677 AI806500 Hs.102652 ESTs; Weakly simSar to KIAA0437 [lisapiens) 1.9 
60 335687 EOS35618 CH2^3048FGL59Q„2»UNieEM:AC005500.GENSCAM488.2 

CH22_F6ENES.596.2 1.9 

330731 EOS30662 AA278816 Hs»177204 ESTs 1.9 

315542 EOS15473 AA079476 Hs.109857 ESTs; Highly similar to CG1^9 protein [H.sapiens] 1.9 

336379 EOS36310 CH2^_3791FG_821J.UNK_BA232E17.GENSCAN.4.19 

65 CH22_FGENES.821_7 1.9 

305691 EOS05622 AA813590 Hs.119500 karyopherin alpha 4 (importin alpha 3) 1.9 

310639 EOS10570 AW269082 Hs.175162 ESTs 1.9 

327481 EOS27412 cXhs gi|5867783|refjgn 3 ♦104472104673 ex 1 4 COSf 14.33 202 1308 

Ca02JisgiI5B67783 1.9 

70 301910 EOS01841 T84852 Hs.98370 cytodiromeP540 family member predided from ESTs 1.9 

335478 EOS35409 CH22 2830FG_569J_UNK.EMAa)05500.GENSCAN.456-1 

CH22.FGENE&569J 1.9 

331135 EOS31066 R61398 Hs.4197 ESTs 1.9 

335690 EOS35621 CH22.3051FG.596.5.UNK.BftA000550aGENSCAN.48S.5 

75 CH22JGENES,598.5 1.9 

308047 EOS07978 A1459633 EST singleton (not in UniGene) with exon hit 1.9 

334500 EOS34431 CH22_1800FG 397J6 UNieEM:AC00550aG£NSCANmi8 

CH22J'GENES.397.16 1.9 

338250 EOS38181 CH22_6848FG_UNK_EM:AC005500.GENSCAN.269. 

80 2 CH22.EMAC00550aGENSCAN.269-2 1.8 

320618 EOS20549 AI220276 H5.235228 EST 1.8 

335044 EOS3497S (M2^2367FGJ80J.UNKJM:AC0(15500.GENSCAN.374.1 

CH22J6ENES,480J • 1.8 

313789 EOS13720 AI167810 Hs.217743 ESTs 1.8 

85 311911 EOS11842 Alt]87123 Hs.114434 ESTs; Weakly sinnlar toil!! ALU SUBFAMILY J WARNING ENTRY !lil[H.sapiens) 1.8 

320180 EOS20111 AA846203 Hs.193974 ESTs; Weakly similar to alternatively spliced product using exon 13A[H^ 1.8

311036 EOS10967 AI539227 Hs.114039 ESTs 1.8

323303 EOS23834 AA773580 Hs.193598 ESTs 1-8 

318676 EOS18607 157448 Hs.1S487 ESTs; Moderalsly similar to putaSvaphosphdnosi^^ 1.8 

303007 EOS02938 AA478876 Hs.7037 paSd (mouss) homolog; palDdai 1-8 
334806 EOS34737 CH2a.2119FG.435J_IJ^0eB^C0(S500LGENSCAN.29&^ 

CHZLFGEMES.435_7 18 

311767 EOS11698 A1076686 Hs.190055 ESTs ~ 1.8 

331750 EOS31681 AA284372 Hs.111471 ESTs 18 

314872 EOS14803 A1144254 Hs^26 ESTs 1.8 

314071 EOS14002 M192455 Ks.188690 ESTs 1.8 
328450 EOS28381 cjjis giI5888425|rel|gn 2 - 209192 209321 ex 2 3 CDS! ia41 1301407 

CH.07JIS ^868425 1.8 
328857 E0S^8 cjjis 8q6381927|r6f|gn 3- 80557 81051 6x1 1 CDSo 41^1 4956090 

CH.07.hsgil6381927 1.8 

313781 EOS13712 AA078836 EST cluster (noi in UniGene) 18 

336953 EOS36884 CH2L4746FG.36L22. CH2?J6ENES,361-22 1.8 

300233 EOS00164 A1380777 Ks.189402 ESTs 1.8 
326862 EOS26793 c^Jis ^|6552465|ref|gn 2 107702 107782 ex 12 13 COSi 3.6281 2149 

CH.20_hsgl|8552465 1.8 

312364 EOS12295 R40111 Hs.187618 ESTs 1.8 

321541 EOS21472 AI220292 Hs^54467 ESTs 1.8 

307432 EOS07363 AI244259 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 1.8 

320921 EOS20852 R94038 H5.199S38 inhlt)fn;b8taC 1.8 
333110 EOS33041 CH22.338FG.79.16.UNieEMAC000097.6EN8CAN.59-15 

CH22_FGENES.79J6 1.8 

324914 EOS24845 M847510 Hs.161292 ESTs 1.8 

312681 EOS12612 AI028149 Hs.193124 pyruvate dehydrogenase kinase; isoenzyme 3 1.8 
335697 EOS35628 ai22«3058FG_596_12_UNieEMAC005500.GENSCAN.488.13 

CH22_FGENES.596J2 1.8 

308462 EOS03393 AI671311 EST singleton (not in UniGene) with exon hit 1.8 

312138 EOS12069 TB9405 Hs.218851 ESTs; WeaHy similar to BHAUi SUBFAMILY J WARNING ENTRY HI! [H^aplensJ 1.8 

309116 EOS09047 AI927149 H&29797 ribosomal protein LI 0 1.8 

320730 EOS20661 AA534539 H1151072 ESTs 1-8 

300844 EOS00775 AL042759 Hs.191762 ESTs 1.8 
337570 EOS37501 CH22„5856FG_UNieC65E1.GENSCAN.4-2 

ai2?_C65E1,GENSCAN.4-2 1.8 

332756 EOS32687 D63479 Hs.115907 diacylglycerd kinase; delta (1301(D) 1.8 

332161 EOS32092 AA621523 Hs.165464 ESTs 1.8 

300942 EOS00873 AW275006 Hs.195959 ESTs 1.8 

300680 EOS00611 AW466066 Hs^57712 ESTs; Weakly similar b KiAA0985 protein [H^apiens] 1.8 
328783 EOS28714 c_7 hs gi|5868309Irefl gn 5 - 73858 73822 ex 2 5 CDS! 0.78 165 5371 

CH.07_hsgi|5868309 1.8 

307542 EOS07473 A)280859 EST singleton (not In UniGene) with exon hit 1.8 

331975 EOS3t906 AA464972 K&99624 ESTs 1-8 

321532 EOS21463 T77886 H8.83428 nudear factor of kappa light polypeptide gene enhancer in B.celis 1 (pl05) 1.8 

318721 EOS18652 Z28504 EST cluster (not in UniGene) 1.8 

302124 EOS02055 AB023967 Hs.145076 regulator of differentiation (in S.pombe) 1 1.8

323541 EOS23472 AI185116 Hs.104613 ESTs; Weakly similar to Similar to S.cerevisiae hypothetical protein L3111 [H.sapiens] 1.8

331057 EOS30988 N71399 Hs.28143 ESTs 1.8 

316860 EOS16791 AW139099 Hs.127489 ESTs 1.8 

330601 EOS30532 U90916 Hs.82845 Human clone 23815 mRNA sequence 1.8

307334 EOS07265 AI214811 Hsi20815 ESTs; Weakly similar to TRW protein [H.sapiens] 1.8

323195 EOS23126 AIQ64982 Hs.117950 multifunctional polypeptide similar to SAICAR synthetase and Al^ 1.8

303856 EOS03787 AA968569 Hs.944 glucose phosphate isomerase 1.8

321553 EOS21484 H92449 Hs.116406 ESTs 1.8 

332705 EOS32636 T59161 Hs.76^ thymosin; beta 10 1.8 
333139 EOS33070 CH22„368FG 83J6.UNK.EM:AC000097.GENSCAN.67-19 

CH22^FGENES.83J6 1.8 
338997 EOS38928 CH22_7881FG_UNK_0A59H18,GENSCAN.8-22 

CH22.DA59H18.GENSCAN.8.22 1.8 

301509 EOS01440 AI025435 Hs.117532 ESTs 1.8

314522 EOS14453 AI732331 Hs.167750 ESTs; Moderately similar to !!!! ALU CLASS C WARNING ENTRY !!!! [H.sapiens] 1.8

303072 EOS03003 AF157633 EST cluster (not in UniGene) with exon hit 1.8

305271 EOS05202 AA679895 EST singleton (not in UniGene) with exon hit 1.8

335287 EOS35218 CH22J629FG 526 11 UNK.EM:AC005500.GENSCAN.4204 

CH22_FGENES.526J1 1.8 

321286 EOS21217 AIM0940 EST cluster (not in UniGene) 1.8

318740 EOS18671 NM_002543 EST cluster (not in UniGene) 1.8

323465 EOS23396 AA287408 EST cluster (not in UniGene) 1.8

300611 EOS00542 N75450 EST cluster (not in UniGene) with exon hit 1.8

306235 EOS08166 AA932299 EST singleton (not in UniGene) with exon hit 1.8

336721 EOS36652 CH22.4244FG_83J7_ CH2^JGENES.83-17 1.8 

311291 EOS11222 AA782601 Hs.122684 ESTs 1.8 

310247 EOS10178 AI224982 Hs.211454 ESTs 1.8

316564 EOS16495 AI743571 Hs.168799 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.8
328170 EOS28101 c.6Jis gil5868071|ref}gnU 931 70 93295 ex 9 9 COSI 13^1 1263591 

Oi05Jsgi|5868071 1.8 

300909 EOS00840 AV\^79 Hs.154903 ESTs; Weakly similar to Abl substrate ena [D.melanogaster] 1.8

330869 EOS30800 AA115197 Hs.183702 ESTs 1.8 

311048 EOS10979 AA5Q6952 Hs.210508 ESTs 1.8 
333764 EOS33695 CH2L1 031 FG^271_3_UNICEM:AC005500.GENSCAN.1 27-3 

CHZLFGENES.271J 1.8 
338862 EOS38793 CH22J715FG_UNKJW32I10.GENSCAN.1-6 

CH2i.DJ32fiaGENSCAN.1.6 1.8 

331467 EOS31398 N22206 Hs.43112 ESTs 1.8 
327742 EOS27673 cJ5Js gil5867944|ref|gn 3- 143307 143512 ex 1 3 COSl 11.07 206 172 
Ca05.hsgi|58S7944 1.8

320955 EOS20885 AL049415 Hs^04290 Homo sapiens mRNA; cDNA DKFZp588N2119^c^ 1.8

323589 EOS23520 AW390054 Hs.192843 ESTs 1.8

319951 EOS19882 AA307665 Hs.14559 ESTs 1.8 
333783 EOS336S4 CH2?.1()30ro.271JLUNK^EM:AO(X)^GB4SCAN.127-2

CH22_FGENEa27L2 1.7 

331046 EOS30977 N66563 Hs.191358 ESTs 1.7 

320001 EOS19932 AA873350 EST cluster (not in UniGene) 1.7

316869 EOS16800 AI954880 Hs.134604 ESTs 1.7

310774 EOS10705 AW134483 Hs.164371 ESTs 1.7

319379 EOS19310 T91443 Hs.193963 ESTs 1.7 

321549 EOS21480 AA470984 Hs.161947 ESTs 1.7

300823 EOS00754 AI863068 Hs.222665 ESTs; Weakly similar to putative zinc finger p^lynNY•l^^^ antigen 1.7

324228 EOS24159 AI798146 Hs.207780 ESTs 1.7

313902 EOS13833 AI308165 Hs.156242 ESTs 1.7

308928 EOS08859 AI863908 EST singleton (not in UniGene) with exon hit 1.7

333770 EOS33701 ai22_1037FG 272_LUNK^^C00550aGENSCAN.127-10 

CH2^^GENES^1 1.7 

316934 EOS16865 AI571647 Hs.146170 ESTs 1.7

313219 EOS13150 N74924 Hs.182099 ESTs 1.7

317360 EOS17291 AI125252 Hs.126419 ESTs 1.7

303530 EOS03481 AI274851 Hs^8744 ESTs 1.7 
334733 EOS34670 CH22_205lFGJ24J4.LiN)eEM:AC005500.GENSCAW.285-16 

CH22J^GENES.424J4 1.7 

337670 EOS37601 CH22_5996FGL_UNieEM'AC000097.GENSCAN.57-2

CH22_EM:AC000097.GENSCAN.57-2 1.7

312079 EOS12010 n9745 Hs.189717 ESTs 1.7

320211 EOS20142 AL039402 Hs.125783 DEME-6 protein 1.7

316218 EOS16149 AW207642 Hs.174021 ESTs 1.7 

335682 EOS35613 CH22_3043FG_595_2_UN1CEM:AG005500.GENSCAN.487-11

CH2^,F6ENE&595J 1.7 

330696 EOS30627 AA022632 Hs.15825 ESTs 1.7 

314449 EOS14380 AL042667 Hs.225539 ESTs 1.7 

311972 EOS11903 N51511 Hs.188449 ESTs 1.7 

307691 EOS07622 AI318285 Hs.182371 prothymosin; alpha (gene sequence 28) 1.7
338249 EOS38180 CH2^6847FG^UNieEM:AC005500.GENSCA>J.269-1 

CH2a_EM:AC005500.GENSGANmi 1.7 
326399 EOS26330 clOJis gi|5867353|req gn U 6385 6536 ex 6 6 CDSI 10.69 152 684 

CH.19^iisgll5867353 1.7 

313290 EOS13221 AJ753247 Hs.206454 ESTs 1.7

301615 EOS01546 W39477 EST cluster (not in UniGene) with exon hit 1.7

307034 EOS06965 AI142526 EST singleton (not in UniGene) with exon hit 1.7

313577 EOS13508 AA565051 Hs.155029 ESTs 1.7

324703 EOS24634 AB009282 Hs.31086 Homo sapiens mRNA for cytochrome b5; partial cds 1.7

321317 EOS21248 AI937060 Hs.202040 ESTs; Weakly similar to KIAA0938 protein [H.sapiens] 1.7

312278 EOS12209 AW205234 Hs.201587 ESTs 1.7 
333358 EOS33289 CH2^.604FGJ4U9.UNK_EMAC005500.GENSCAN.21-9 

CH22J=GENES.14L9 1.7 

322735 EOS22665 AA086123 EST cluster (not in UniGene) 1.7

326752 EOS26683 c20_hsgil5867615|refign 1-1214 1562 ex 22 CDSf 33.07 349 1386

CH2D>g)15867615 1.7 

314733 EOS14664 AW452355 Hs.256037 ESTs 1.7 

312902 EOS12833 AW292797 Hs.130316 ESTs 1.7

322653 EOS22584 AI828854 Hs.171891 ESTs 1.7

336015 EOS35946 CH22.3398FG_669_4„UNK.DJ32110.6EI4SGAMM'

CH2^.F6ENES.669_4 1.7 

324500 EOS24431 AW269819 Hs.169905 ESTs 1.7 

310900 EOS10831 AI922728 Hs.165803 ESTs; Weakly similar to !!!! ALU SUBFAMILY SB WARNING ENTRY !!!! [H.sapiens] 1.7
337908 EOS37839 CH22.6323FG_UNICEMAC005500.GENSCAN.57-1

CH22_EM:AC005500.GENSCAN.57-1 1.7

304084 EOS04015 T91986 EST singleton (not in UniGene) with exon hit 1.7 

332539 EOS32470 AA412528 Hs.20183 ESTs; Weakly similar to cDNA EST EMBL:T01421 comes from this gene [C.elegans] 1.7

314332 EOS14263 A1J037551 Hs.95612 ESTs 1.7

321412 EOS21343 AW366305 EST cluster (not in UniGene) 1.7

312187 EOS12118 AA700439 Hs.188490 ESTs 1.7

314147 EOS14078 AI656135 Hs.129805 ESTs 1.7

303131 EOS03062 AW081061 Hs.103180 actin^keO 1.7 

331341 EOS31272 AA303125 Hs.119009 ESTs; Weakly similar to !!!! ALU SUBFAMILY SB2 WARNING ENTRY !!!! [H.sapiens] 1.7

313615 EOS13546 AW295194 Hs.25264 DKFZP434N126 protein 1.7

329598 EOS29529 c10j>2gi|3982482igb|Agn4i>3992440220ex23CDSi a71 297420

CH.10_p2gl|3962482 1.7 

303579 EOS03510 AA381124 Hs.193353 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.7

331692 EOS31623 W93592 Hs.47343 ESTs 1.7

323977 EOS23908 AW328177 Hs.234713 ESTs 1.7

332930 EOS32861 CH22_151FGJ8J.UNieC20H11GENSCAN.2^

CH2?J^GENES.38 4 1.7 
326596 EOS26527 c19_hs gil6138928IrBfl gn 4 133386 133563 ex 7 9 CDSi -1.32 178 3520 

Cai9JisgiI8138928 1.7 

314946 EOS14877 AI097229 Hs.217484 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.7

315357 EOS15288 AAfi08684 Hs.121705 ESTs; Moderately similar to !!!! ALU CLASS C WARNING ENTRY !!!! [H.sapiens] 1.7

324728 EOS24659 AA303024 EST cluster (not in UniGene) 1.7

317501 EOS17432 AA931245 Hs.137097 ESTs 1.7 

332219 EOS32150 N22S08 Hs.139315 ESTs 1.7 
335369 EOS35300 CH22J718FG_543_7_UNieEMAC005500.GENSCAN.432-9 

CH22_FGENES.543_7 1.7

322417 EOS22348 W36288 Hs.171873 ESTs; Weakly similar to PUTATIVE STEROID DEHYDROGENASE KIK-I [M.musculus] 1.7
316100 EOS16031 AW203388 Hs.213b03 ESTs 1.7

314886 EOS14797 AW305124 Hs.1916B2 ESTs 1.7

300328 EOS00259 AW015860 Hs.224623 ESTs 1.7

315876 EOS15807 AW0D2565 Hs.136S90 ESTs 1.7

314183 EOS14114 AA748600 EST cluster (not in UniGene) 1.7

321354 EOS21285 AA078493 EST cluster (not in UniGene) 1.7

311904 EOS11835 T86907 Hs.119371 ESTs 1.7

322890 EOS22821 AA082030 EST cluster (not in UniGene) 1.7

302759 EOS02690 AIB85815 Hs.184727 ESTs 1.7 

324600 EOS24531 AA503297 Hs,117t08 ESTs 1.7 

314973 EOS14904 AW273128 Hs.254669 EST 1.7 

324432 EOS24363 AA464510 EST cluster (not in UniGene) 1.7

331520 EOS31451 rM9068 Hs.93966 ESTs 1.7

308380 EOS08311 AI623988 EST singleton (not in UniGene) with exon hit 1.7

331010 EOS30941 H95039 Hs^68 KIAA0442 protein 1.7
325383 EOS25294 c1ZJisgq5866920|r8qgn7i>700448700516ex68CDSi-&5871113

Caia.hsgi|5866920 1.7 

310470 EOS10401 AI281848 Hs.165547 ESTs 1.7

330711 EOS30642 AA164687 Hs.177576 mannosyl (alpha-1;3-)-glycoprotein beta-1;4-N-acetylglucosami^^ 1.7

332074 EOS32005 AA599012 Hs^826 ESTs 1.7

309732 EOS09663 AW262211 Hs.5662 guanine nucleotide binding protein (G protein); beta polypeptide 2-like 1 1.6

306337 EOS06268 AA954221 Hs.73742 ribosomal protein; large; P0 1.6
335189 EOS35120 CH22_2525FG_507_4.UNK_EM:AC005500.GENSCAN.4004 

CH2?^R3ENE&507J 1.6 

316253 EOS16184 AI919537 Hs.1180S6 ESTs 1.6
332908 EOS32839 CH22_129FG_36J^UNK_C20H12.GENSCAN.28-9 

CH22JGENES.36J2 1.6 

310002 EOS09933 AI439096 Hs.25832 ESTs 1.6

332258 EOS32189 N68670 Hs.103808 ESTs; Weakly similar to RanBPM [H.sapiens] 1.6
336182 EOS36113 CH22^3576FG 715_2«UI^IK_DAS9H18.GENSCAN.19.3

CH22^FGENES.715J 1.6 
328987 EOS28918 c.9Jisgi|5868535|reqgn1- 25705 25764 ex 3 10 CDS! 9.9060438 

CH.09Jisgi|5868535 1.6 

324481 EOS24412 AI916284 Hs.199671 ESTs 1.6 

331406 EOS31337 AA6100S4 Hs.23440 KIAA1105 protein 1.6

332280 EOS32211 R^IOO Hs.106294 ESTs 1.6 

332173 EOS32104 F09281 Hs.90424 ESTs 1.6 
335739 EOS35670 CH2^.31O2F6_601JO_UNieEM:AC0055G0.GENSCAN.491-10 

CH2?_FGENES.601J0 1.6 

332104 EOS32035 AA609177 Hs.109363 ESTs 1.6 

315033 EOS14964 AI493046 Hs.146133 ESTs 1.6 
334740 EOS34671 CH2?.2052FGL424J5_UNK_E1VIAC005500.GENSCAN.285-17 

CH22_FGENES.424_15 1.6
334783 EOS34714 CH2?J095FG_432.6.UNK.EM:ACO05S00.GENSCAN.293-11 

CH2?JGENES.432J 1.6 

308010 EOS07941 AI439190 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 1.6

304521 EOS04452 AA464716 EST singleton (not in UniGene) with exon hit 1.6

318719 EOS18650 Z25900 Hs.18724 Homo sapiens mRNA; cDNA DKFZp564F093 (from clone DKFZp564F093) 1.6

321920 EOS21851 N63915 EST cluster (not in UniGene) 1.6

315019 EOS14950 AA532807 Hs.105822 ESTs 1.6

320793 EOS20724 AL049980 Hs.184216 DKFZP564C152 protein 1.6

305371 EOS05302 AA714180 EST singleton (not in UniGene) with exon hit 1.6

305054 EOS04985 AA634127 Hs.182426 ribosomal protein S2 1.6

314643 EOS14574 AI587502 Hs.192088 ESTs 1.6

308186 EOS08117 AI537940 EST singleton (not in UniGene) with exon hit 1.6

319371 EOS19302 R00321 Hs.174928 ESTs 1.6 

331700 EOS31631 Z40011 Hs.180582 ESTs 1.6 

316955 EOS16886 AW203959 Hs.149532 ESTs 1.6

314961 EOS14892 AW008061 Hs.231994 ESTs 1.6

336676 EOS36607 CH22L4154FG_43_4. CH22_FGENES.4a4 1.6

322801 EOS22732 AI831910 Hs.163734 ESTs 1.6

303383 EOS03294 AI964095 Hs.226801 ESTs; Weakly similar to DIArlse protein [H.sapiens] 1.6
328105 EOS28038 c.6Jis giI5868020Iref}gn1 1-301705 301784 ex 4 7 CDS &30 80 3147 

CH.06_hsgl|5868020 1.6 
325481 EOS25412 c12Jisgi|58569S7|rBf]gn3-f 47590 47672 ex 4 7 COS! 169831895 

Cai2.hsgi|5866957 1.6 

315361 EOS15292 AI335229 Hs.122031 ESTs 1.6

324902 EOS24833 D31323 Hs.211188 ESTs 1.6 
336018 EOS3S949 CH22L340iFG_669_7.UNK.DJ32110.GENSCAN.9-1 2 

CH22_FGENES.669J 1.6 

308747 EOS08678 AI804500 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 1.6
328251 EOS28182 c_6Jis gi|6381891|reqgn 4 ^124444 124557 ex 2 3 COSi 0.401144554 

ai06Jisgq6381891 1.6 

303153 EOS03084 U09759 Hs.8325 mitogen-activated protein kinase 9 1.6
327809 EOS27740 c_5_hs gil5867968(reflgn 3 + 54610 54761 ex 44 COSI 0.78152993 

ai05Jisg?5867968 1.6 

314107 EOS14038 AA806113 Hs.189025 ESTs 1.6 

300304 EOS00235 AIS37934 Hs.224978 ESTs 1.6 

313009 EOS12940 W52010 Hs.191379 ESTs 1.6 
331074 EOS31005 R08440 yf19i9^1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:127337 3' sM^ contains Alu repetitive element; mRNA sequence 1.6
335773 EOS35704 CH2^3142FG_607_9_UNKJM:AC005SOO.GENSCAN.49&4 

CH22_FGENES.607_9 1.6 
334991 EOS34922 CH2^J312FGJ69J1_UNK.E^tAC005500.GENSCAN.365■11 

CH22J^GENES.469_11 1.6 

322959 EOS22890 AI287606 EST cluster (not in UniGene) 1.6
323731 EOS23662 AA^14 EST cluster (not in UniGene) 1.6

331073 EOS31004 R07S38 Hs.18828 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY 1.6

313573 EOS13504 At0762S9 Hs.150337 ESTs 1.6

316949 EOS16880 AA856749 Hs.124620 ESTs 1.6
328084 EOS28015 c.6JiS gq6469819|iBf)gn 3 -155386 155459 6x1 4 CDSII^ 94 2982 

CR0U)5gi|6469819 1.6 

331526 EOS31457 N4MS7 Hs.46624 ESTs 1.6 

317987 EOS17918 AW138174 Hs.130651 ESTs 1.6 
325594 EOS25525 c13J)sgi|S866992|reftgn 4 - 470474 470566 ex 2 3 COS! 8.099368 

Cai3_hsgil5866992 1.6 

310848 EOS10779 AI459554 Hs.161288 ESTs 1.6

309268 EOS09199 AI985821 Hs.62954 ferritin; heavy polypeptide 1 1.6

304518 EOS04449 AA461438 EST singleton (not in UniGene) with exon hit 1.6

331065 EOS30996 N90584 Hs.9167 Homo sapiens clone 25085 mRNA sequence 1.6

306501 EOS06432 AA987294 EST singleton (not in UniGene) with exon hit 1.6

323269 EOS23220 AL134235 Hs.222442 ESTs 1.6
334630 EOS34561 a^2^1938FG_416_6^UNK.Ei\^:AC005500.GENSCAN.277■6 

CH22„FGENES.416_6 1.6 

302025 EOS01956 AI091466 Hs.127241 DKFZP564F052 protein 1.6
328998 EOS28929 CL9Jsgi|5868538|ief|gn U40996 41104ex 1 3COSf1100109480

CH.09_hsgi|5868538 1.6

313197 EOS13128 AI738851 Hs.222487 ESTs 1.6
338763 EOS38694 CH2^.7585FG_UNK.EM:AC005500.GENSCAN.517-16

CH22^liftAC005500.^SCAN.517-18 1.6

332247 EOS32178 N58172 Hs.109370 ESTs 1.6

316724 EOS16655 AA810788 Hs.123337 ESTs 1.6

303306 EOS03237 AA215297 EST cluster (not in UniGene) with exon hit 1.6

306336 EOS06267 AA954198 EST singleton (not in UniGene) with exon hit 1.6

308256 EOS08187 AI565498 EST singleton (not in UniGene) with exon hit 1.6

307056 EOS06987 AI148675 EST singleton (not in UniGene) with exon hit 1.6

321370 EOS21301 AJ227900 EST cluster (not in UniGene) 1.6

336262 EOS36193 CH2a.3661FG_754.9_UNK.DA59H18.GENSCAN.57.11

CH2^FGENES.754.9 1.6
335497 EOS35428 CH22J849FG_571J_UNieEM:AC00550a6ENSCAN.460-26

CH22_FGENES.571_5 1.6

309582 EOS09513 AW169657 EST singleton (not in UniGene) with exon hit 1.6

329563 EOS29494 c10_p2gil3962490IgblAgn 1-410 635 ex 22 CDSf 13.80 226 267

Cai04)2gi|3982490 1.6

332504 EOS32435 AAfl53917 Hs.15106 chromosome 14 open reading frame 1 1.6

308090 EOS08021 AI474601 Hsi186 eukaryotic translation elongation factor 1 gamma 1.6

331752 EOS31683 AA287312 Hs.191648 ESTs 1.6

330881 EOS30812 AA132988 Hs.69321 ESTs; Weakly similar to Similar to mucin and several other Ser-Thr-rich proteins [S.cerevisiae] 1.6

315G47 EOS15576 AA648983 Hs.212911 ESTs 1.6

336766 EOS36697 CH22.4341FGL143J0_ CH2LFGENES.143-20 1.6

302592 EOS02523 AA294921 Hs.250811 v-ral simian leukemia viral oncogene homolog B (ras related; GTP bindin 1.6

315076 EOS15007 AI623817 Hs.168457 ESTs 1.6

337056 EOS36987 CH22_4946FG_441J_ CH22_FGENES.44U 1.6

322175 EOS22106 AF085975 EST cluster (not in UniGene) 1.6

336833 EOS36764 CH22_4504FG_242_2_ CH22_FGENES.242-2 1.6
334902 EOS34833 CH22_2219FG_452_16_UNieEMAC005500.GENSCAN.34M9

CH22_FGENES.452_16 1.6

318671 EOS18602 AA188823 Hs.212621 ESTs 1.6

308064 EOS07995 AI469273 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 1.6

320559 EOS20490 AB021981 Hs.159322 solute carrier family 35 (UDP-N-acetylglucosamine (UDP^^ 1.6

317881 EOS17812 AI827248 Hs.224398 ESTs 1.6

313078 EOS13009 N4973D EST cluster (not in UniGene) 1.6

338689 EOS38620 CH2^7464FG_UNK.EM:AC005500.6ENSCAN.47M

CH2a.EMtAC006500.GENSCAN.475^ 1.6

311804 EOS11735 AA135159 Hs.203349 ESTs 1.6

316359 EOS16290 AI472213 Hs.123415 ESTs 1.6
330182 EOS30113 c_4_p2 g]15123954|emb|gn 4 -^120156 120245 ex 22 COSl 4.69 90 11

Ca04j)2gi|5123954 1.6
334718 EOS34649 CH22J028FG_421_29.UNK.EMAC005500.GENSCAN.282-29

CH22_FGENES.421_^ 1.6

324196 EOS24127 AA405524 Hs.178000 ESTs 1.6

305350 EOS05281 AA706676 EST singleton (not in UniGene) with exon hit 1.6

331469 EOS31400 N22273 Hs.39140 ESTs 1.6

305715 EOS05646 AA826384 EST singleton (not in UniGene) with exon hit 1.6

314460 EOS14391 AI263231 Hs.145607 ESTs 1.6

317634 EOS17565 AA953088 Hs.127550 ESTs 1.6
335293 EOS35224 CH2?J635F6_527XUNl^M:AC005500.GENSCAN.421-9

CH22J^GENE&527J 1.6

305611 EOS05542 AA782331 EST singleton (not in UniGene) with exon hit 1.6

310430 EOS10361 AI670843 Hsi00257 ESTs 1.6

323698 EOS23627 AA641201 Hs.222051 ESTs 1.6

300610 EOS00541 N72596 Hs.99120 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide; Y chromosome 1.6
327364 EOS27295 cjjis ffl6552412Ireqgn 2- 115235 115396 ex 1 9 CDSI i77 1623007

CR01JisgiI6552412 1.6

324848 EOS24779 AW021857 EST cluster (not in UniGene) 1.6

321491 EOS21422 H70665 Hs.1B39«) ESTs 1.6
336367 EOS36298 CH22.3779FG.818jl_UNieBA232El7.GENSCAN.3.17

CH2UH3ENE&818J1 1.6

331549 EOS31480 N56866 Hs.237507 EST 1.6
328332 EOS28263 c_7Jisgi|5868375|ref|gn6> 280154 280289ex35CDSi-104136516

CH.07_hsgii5868375 1.5

322817 EOS22748 002420 EST cluster (not in UniGene) 1.5
303983 EOS03914 AW514111 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 1.5
329434 EOS29385 c^Jtsgi|586^|reqgn1- 31 124 31 263 ex 3 20 CDS! 6.38140241 

CH.YJisgi|5868883 1.5 
338196 EOS38127 CH2a.6763RL.UNieBAAC005mGajSCAN.23S-1^

CH22_EM:AC005500.GENSCAN.235-16 1.5

308488 EOS08419 Ai68214d Hs.179651 Homo sapiens clone 24703 beta-tubulin mRNA; compJ^ 1.5

314883 EOS14814 AW178807 Hs.246182 ESTs 1.5

307095 EOS07026 AI187910 EST singleton (not in UniGene) with exon hit 1.5

306953 EOS06884 AI124971 EST singleton (not in UniGene) with exon hit 1.5

331788 EOS31717 AA398539 Hs.97369 EST 1.5

303509 EOS03440 AW378236 Hs.256050 ESTs 1.5

324515 EOS24446 AW501666 Hs.163539 ESTs 1.5 
339323 EOS39254 CH22.8284FG.UNieBA35411^GENSCAli23-2 
CH2^JA354l1ZGENSCAN.23-2 1.5 

308563 EOS08494 AA99S296 EST singleton (not in UniGene) with exon hit 1.5

316076 EOS16007 AW297895 Hs.116424 ESTs 1.5 
325622 EOS25553 c14 hsgi|5867000|reqgn 2 -^69994 70075 ex 6 8 CDS) 9.4082194 

Cai4JsgiI5887000 1.5 

309632 EOS09563 AW193261 Hs.156110 immunoglobulin kappa variable 1D-8 1.5

314926 EOS14857 Al^0838 Hs.124835 ESTs 1.5

314456 EOS14389 AI217440 Hs.143873 ESTs 1.5
335219 EOS35150 CH2?J558FG.513^^.U^^K_EM:AC00550aGENSCAN.408.2 

CH22_FGENES.513_2 1.5 

301079 EOS01010 AA305a47 Hs.183654 ESTs; Weakly similar to unknown [S.cerevisiae] 1.5

334122 EOS34053 ai22.1400I^_333J_UNK.EM:AC00550aGENSCAN.185-27

CH2LFGENES^3J 1.5 

308139 EOS08070 AI494477 EST singleton (not in UniGene) with exon hit 1.5

317412 EOS17343 AI301528 Hs.132604 ESTs 1.5 

315073 EOS15004 AW452948 Hs.257631 ESTs 1.5 

313139 EOS13070 AA362113 EST cluster (not in UniGene) 1.5

307012 EOS06943 AI140212 EST singleton (not in UniGene) with exon hit 1.5

322895 EOS22826 AW470295 Hs.192152 ESTs 1.5

303779 EOS03710 AA897298 Hs.221266 ESTs 1.5

312344 EOS12275 AI742618 Hs.181733 ESTs; Weakly similar to nitrilase homolog 1 [H.sapiens] 1.5

323632 EOS23563 AL039950 EST cluster (not in UniGene) 1.5

332336 EOS32267 T96130 Hs.137551 ESTs 1.5

304547 EOS04478 AA486189 EST singleton (not in UniGene) with exon hit 1.5

335692 EOS35623 CHZL3053FG_596_7_UNICEM:AC005500.GENSCAN.488-7 

CH22^F6ENES.596_7 1.5 

328333 EOS28264 c.7Jis gt|5868375|ref)gn 6 ^ 282506 282664 ex 4 5 C[)SI 7.71 159517

CH,07Jsgil5868375 1.5

304143 EOS04074 R88737 EST singleton (not in UniGene) with exon hit 1.5

3»625 EOS29555 cl1j>2gl|4567169|gblAgn2 - 8589385984ex35COSI Z249229 

CH.11j)2gi|4567169 1.5 

329960 EOS29891 c16_p2gil5091594lgbtAgn M031 1162 ex 1 3 CDS! 1075132415

Cai6j>2gil5091594 1.5

318975 EOS18906 Z44110 EST cluster (not in UniGene) 1.5

321875 EOS21806 N49122 EST cluster (not in UniGene) 1.5

320451 EOS20382 R26944 Hs.180777 Homo sapiens mRNA; cDNA DKFZp564M0264 (from clone DK^ 1.5

336020 EOS35951 CH22_3403FG_669J_UNKJ)J32I10.GENSCAN.9-14

CH2i.FGENES.6693 1.5 

332581 EOS32512 T28799 Hs.2913 EphB3 1.5 
338622 EOS38553 (m.7384FG^UNK.EM:AC00550aGENSCAN.45M 

CH22_EM;AC005500.GENSCAN.451-1 1.5 

330397 EOS30328 D14659 Hs.154387 KIAA0103 gene product 1.5

314359 EOS14290 AA205569 Hs.194193 ESTs 1.5 

313456 EOS13387 AW380579 Hs.209657 ESTs 1.5 

318486 EOS18417 H09123 Hs.139258 ESTs 1.5 

318175 EOS18106 AA644624 EST cluster (not in UniGene) 1.5

335684 EOS35615 CH22.3045FG.595XUNieEM:AC005500.G£NSCAN.487-13

CH2^FGENES.595_4 1.5
327814 EOS27745 C.5ji8gi|5667968|r8f|gn6+ 6937770566ex1 2COSf86.151190999

CH.05.hsgi|5867968 1.5 

322120 EOS22051 W84351 Hs.213846 ESTs 1.5 

311749 EOS11680 R06249 Hs.13911 ESTs 1.5
329797 EOS29728 c14_p2 gi|8523160|emb| gn 1 - 10616 10894 ex 3 6 CDSt 5.86 279 1549 

Cai4j)2gf|6523160 1.5 

330630 EOS30561 X78669 Hs.79088 reticulocalbin 2; EF-hand calcium binding domain 1.5

303777 EOS03708 AA348491 EST cluster (not in UniGene) with exon hit 1.5

309856 EOS09587 AW197060 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 1.5
326165 EOS26098 c17Ji8 9qS867208|ref)gn 2 - 62787 62929 ex 1 10 GOSI a87 143 2037 

Cai7Jtegi|5B67208 1.5 

308328 EOS08259 AI590571 Hs.186412 EST 1.5

300601 EOS00532 AI762130 Hs.165619 ESTs 1.5

303610 EOS03541 AA323288 EST cluster (not in UniGene) with exon hit 1.5

307856 EOS07787 AI366158 EST singleton (not in UniGene) with exon hit 1.5

319920 EOS19851 R54575 Hs.13337 ESTs; Weakly similar to similar to Phosphoglucomutase and phosphom^^ phosphoserine [C.elegans] 1.5

332167 EOS32098 D573B9 Hs.75447 ralA binding protein 1 1.5

316427 EOS16358 AI241019 Hs.145644 ESTs 1.5

303886 EOS03817 AW365963 EST cluster (not in UniGene) with exon hit 1.5

314292 EOS14223 AA7325gO Hs.134740 ESTs 1.5

315408 EOS15339 AW273261 Hs.216292 ESTs 1.5
335698 EOS35629 CH22.3059FG_597J.UNieEMAO00550aGENSC^^

CH2?.FGENES.S97J 1.5

315084 EOS15015 AI821085 Hs.1B7796 ESTs 1.5
335062 UnZ4_/4Mrb_oUoCUNK^tMJ\WAw /

CH22_FGENES.482_17 1.5

333243 EOS33174 OI2IL482pG_i 1 1_7_IJN1\„EM:AC000097.6ENSCAN. 1 2XyQ

CH22_FGENES.111_7 1.5

306380 EOS06311 AA968861 EST singleton (not in UniGene) with exon hit 1.5

320809 EOS20740 AI540299 EST cluster (not in UniGene) 1.5

332813 EOS32744 CH2<^FG_8_1JJNK^C65E1.GENS(IAN.2-2

CH22_FGENES.8_1 1.5

335817 EOS35748 Cn22.o189F6.Dl oJ5_LJNi\JEMMC005500.(3ENSCiAN.51 0-5

CH22_FGENES.618_5 1.5

319351 EOS19282 AA701668 EST cluster (not in UniGene) 1.5

334472 EOS34403 unZ2_1 771 ruL994.3_UNivcMAuJUo50u.Gi:NoUA

CH22_FGENES.394_3 1.5

333029 EOS32960 C3i22_255rGjD8_o_iJNi\_EM:AC000097. C3ENSCAN.40-3

Cn22^rGcl>lt5.DB„o 1.5

oOwo5 EOoU79oo Al4ooU9l Hs.119252 tumor protein; translationally-controlled 1 1.5

302882 AW403330 EST cluster (not in UniGene) with exon hit 1.5

314033 EOS13964 AA167125 EST cluster (not in UniGene) 1.5

324928 EOS24859 AJ9322oS Hs.160559 ESTs 1.5

329524 EOS29455 Cl0j)2 gi|39B3507igD|A gn 6 • 38025 38143 ex 3 3 CuSI Z40 119 170

CR10_p2gi|3983507 1.5

333131 EOS33062 C>l22«3o0rG_83_o_UNA^EJvtAC00ure7.GtNS

CH22_FGENES.83_6 1.5

332085 EOS32016 AA600353 Hs.173933 ESTs; Weakly similar to NUCLEAR FACTOR 1/X [H.sapiens] 1.5

305369 EOS05300 AA714040 EST singleton (not in UniGene) with exon hit 1.5

300344 EOS00275 AW291487 HSwc13659 ESTs 1.5

j25(/fi n03o99 EST cluster (not in UniGene) 1.5

323693 EOS23624 AWira775o Hs.249721 ESTs 1.5

3Z2oiU At 100919 EST cluster (not in UniGene) 1.5

335932 CnZ<_Zb77pe_o35jD.LJ Nr\_cM>\C005500.GcN5UAN.4zo*

C^Z<J^cNtio.535jD 1.5

307565 EOS07496 AI282468 EST singleton (not in UniGene) with exon hit 1.5

01414U cU0i*tur1 BUiwo Hs.154297 ESTs 1.5

323011 EOS22942 AAoBD26o EST cluster (not in UniGene) 1.5

325366 EOS25297 Cl<_ns gtl5o66920{rei| gn 9 • 920982 921713 ex 1 8 UjSI 15.95 752 167

CH.12_hs gi|5866920 1.5

322306 EOS22237 W75935 Hs.146083 ESTs 1.5

311034 EOS10965 A1&O4023 Hs.1714o7 ESTs; Highly similar to NKG2-D TYPE II INTEGRAL MEMBRANE PROTEIN [H.sapiens] 1.5

30^81 CACAC/14 0 AAD41 638 EST singleton (not in UniGene) with exon hit 1.5

322933 EOS22864 AAU99709 EST cluster (not in UniGene) 1.5

335221 EOS35152 CH22_2560FG_513_4_UNK_EM:AC005500.GENSCAN.408-4

1.5

304948 EOS04879 AAol 91 07 EST singleton (not in UniGene) with exon hit 1.5

334900 EOS34831 CH22_2217FG_452_14_UNK_EM:AC005500.GENSCAN.341-17

CH22_FGENES.452_14 1.5

318404 EOS18335 AI654108 Hs.135125 ESTs 1.5

339358 EOS39289 CH22_8328FG_UNK_BA354I12.GENSCAN.31-3

CH22_BA354I12.GENSCAN.31-3 1.5

327074 EOS27005 c21Jis gi|6531965^q gn 58 > 4039993 4040096 ex 3 4 COSl 0.68 104 1284

CH.21_hs gi|6531965 1.5
326054 EOS25985 c17Ji5gq5867184|ieqgn 2 -146342 146469 ex 3 4 03811000 128 4%

Cai7.hS8i|58S7184 1.5 
326892 EOS26823 c20Jts^8882511|r6qgn 5 1-11942411^8x2930 0081^89 77 2313 

CH.20Jisgq6682511 1.5

328767 EOS28698 cjjis giI601 7031 Ireqgn 1-35625 35723 ex 4 4 COSf &63 99 5262

CH.07jBgil6017031 1.5 
337772 EOS37703 ai22.6125FG_UNieB(1:AC000097.GENSCAN.119-11 

CH22^EM:AC000097.GE»mN.119.11 1.5 

312199 EOS12130 AW438S02 Hs.191179 ESTs 1.5

303506 EOS03437 AA340605 Hs.105887 ESTs 1.5

325176 EOS25107 T52843 EST cluster (not in UniGene) 1.5

302023 EOS01954 AF060S87 Hs.126782 sushi-repeat protein 1.5

305833 EOS05764 AA857836 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 1.5

309131 EOS09082 AI929175 Hs.119122 ribosomal protein L13a 1.5
334184 EOS34115 a<22.1465F(L350J5_UNieEM:AC00550aGENSCANmi7 

CH22J^NES.350J5 1.5 
335188 EOS35119 CH2^24F6.507_3.UNieEM:AC005500.GENSCAN.40(^ 

CH22_FGENES.507.3 1.5 

304813 EOS04744 AA584540 EST singleton (not in UniGene) with exon hit 1.5

315359 EOS15290 M608808 Ks^5118 ESTs 1.5 

324434 EOS24365 AA707249 Hs.98769 ESTs 1.5 
327910 EOS27841 c_6Jis giI5868162|reli gn 1 * 21622 21748 ex 6 7 CDSI 3.69 127 449 

CH.0UisglI5868162 1.4 
335671 EOS35602 (^22.3031F6_592.3.UNK.EMA(»05500.GENSCAN.48S4

CH22.FGENES.592_3 1.4 
334943 EOS34874 CH2aj264FG.465XUNK.EM:AC0C5500.^SCAN.359^ 

CH2^F(^ES.465_8 1.4 
326393 EOS26324 c19jisgl|58673411r8l|9n2 + 4170241841 ex55CDSi2ai5140504 

CR19JsgiI5867341 1.4

305296 EOS05227 AA687181 EST singleton (not in UniGene) with exon hit 1.4

307243 EOS07174 AI1999S7 EST singleton (not in UniGene) with exon hit 1.4

320066 EOS19997 AW364885 Hs.112442 ESTs 1.4

311465 EOS11396 AI75B660 Hs.206132 ESTs 1.4

302822 EOS02753 AW404176 Hs.111611 ribosomal protein L27 1.4

304987 EOS04918 AA618044 EST singleton (not in UniGene) with exon hit 1.4

330892 EOS30823 AA149579 Hs.118258 ESTs 1.4 
333385 EOS33316 CH2?.631FGJ43_24_UNK_EM:AC005500.GENSCAN.24.18 

CH2?_FGENES.143.24 1.4 

302626 EOS02557 AB021870 EST cluster (not in UniGene) with exon hit 1.4

318042 EOS17973 AW294522 Hs.149991 ESTs 1.4 
339361 EOS39292 CH2?_8331F6_UNK_BA354t12.GENSCAN.32-3 

CHZLBA3S4I12GENSCAN.32-3 1.4 

309000 EOS08931 AI660489 EST singleton (not in UniGene) with exon hit 1.4

306004 EOS05935 AA889992 EST singleton (not in UniGene) with exon hit 1.4

329539 EOS29470 c10_p2gi|3g83503|gb|Ugn 1 -1 326ex1 3COSI41.66328212

CH.10_p2gi|3983503 1.4

313663 EOS13594 AI953261 Hs.169813 ESTs 1.4

323538 EOS23469 AW247698 EST cluster (not in UniGene) 1.4

337595 EOS37526 CH2a.5884FG_UNieC20H11GENSCAN.8-1

CH2^C20H12.GENSCAN.8-1 1.4 

303149 EOS03080 AA312995 EST cluster (not in UniGene) with exon hit 1.4 

308484 EOS08415 AI679292 EST singleton (not in UniGene) with exon hit 1.4 

300912 EOS00843 AW138724 Hs.168974 ESTs 1.4 

315158 EOS15089 AA744438 Hs.142476 ESTs; Weakly similar to !!!! ALU CLASS D WARNING ENTRY !!!! [H.sapiens] 1.4

300462 EOS00393 AA746S01 Hs.14217 ESTs 1.4

312730 EOS12661 AI804372 Hs.208661 ESTs 1.4

316868 EOS16799 AI860898 Hs.195602 ESTs 1.4
337629 EOS37560 CH22»5933FG_UNieC20H12.GENSCAN.28-35

CH22_C20H12.GENSCAN.28-35 1.4

332518 EOS32449 D16562 Hs.155433 ATP synthase; H+ transporting; mitochondrial F1 complex; gamma polypeptide 1 1.4

337422 EOS37353 CH22.5624FG_760JL CH22_FGENES.760-2 1.4 
328835 EOS28766 c.7_hs gil5868339Iret|gn 5 ^88053 88461 ex 33 COSl 13.78 409 5775 

CR07>gii5868339 1.4 
338282 EOS38213 CH22_6897FG_UNK_EM:AC005500.GENSCAN.29M

CH2a.EMAC005500.GENSCAN.291.4 1.4 
337895 EOS37826 CH22^6303FG_UNK.EKlAC005500.GENSCAN.56-2 

CH22.EM:AC005500.GENSCAN.56-2 1.4 

320330 EOS20261 AFD26004 Hs.141660 chloride channel 2 1.4 

314302 EOS14233 AA813118 Hs.163230 ESTs 1.4

313280 EOS13211 AI285537 Hs.222B30 ESTs 1.4
333222 EOS33153 CH2a.459FGJ05_2.UNK.EM:AC000097.GENSCAN.109^

CH22_FGENES.105_2 1.4

305726 EOS05557 AA828158 EST singleton (not in UniGene) with exon hit 1.4

312674 EOS12605 AI762475 Hs.151327 ESTs; Moderately similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.4

315869 EOS15800 AI033547 Hs.13^26 ESTs 1.4 
327010 EOS26941 c21Jisgl|5867664Iref|gn 12^ 941057 941 139 ex 9 9 CDSI 7.4483790 

CHL21Jisgil5867664 1.4
325892 EOS25823 c16Js gi|58670M|rBflgn 1-10498 10552 ex 2 3 CDSi 3.94155870

CH.16JisgiI5867088 1.4

302575 EOS02506 AF071164 Hs.249171 homeo box A11 1.4

301970 EOS01901 AB028982 HS.12Q245 KIAA1039 protein 1.4

332207 EOS32138 H61475 Hs.237353 EST 1.4

316024 EOS15955 AA707141 Hs.193388 ESTs 1.4

314599 EOS14530 AW206512 Hs.186998 ESTs 1.4
333585 EOS33516 CH22JB46FGJ203_4_UNKJEM:AG005500.GENSCAN.74^
CH22_FGENES.203J 1.4

324670 EOS24601 AI525557 EST cluster (not in UniGene) 1.4

321307 EOS21238 ^5409 EST cluster (not in UniGene) 1.4

335170 EOS35101 CH2^25(ffiFGL503J.UNK.£M:AC005S0a6ENSCA^

CH22JGENES^J 1.4

328274 EOS28205 c 7J)sgii^6d219|r6qon2 . 3124431439ex1 11 CD9ia061969 

CK07_hsgiI5B68219 1.4 

336880 EOS36811 CH22_4619FG 318X CH22^F6ENE&318^ 1.4 

313825 EOS13756 AA215470 EST cluster (not in UniGene) 1.4

318410 EOS18341 AI138418 Hs.144935 ESTs 1.4
335361 EOS35292 attW710F(L541JLUfn(_EM:AC005500.GENSCAN.431-16 

CH22LFGENES.541J1 1.4 

319802 EOS19733 AI701489 Hs.202501 ESTs 1.4 
334769 EOS34700 CH22.2081FGJ29XUNieE[^C(X)5500.GENSCAri290-9 

15 (»12iJGENES.429j4 1.4 

312709 EOS12640 AW069181 Hs.141146 ESTs;We^s!mBartotransfbmiafion-f8latedp^ 1.4 
330004 EOS29935 c164)2 g!|6623963|9b|Agn 5 - 78872 78999 ex 2 6 CDSi 19.93128728 

CH.16_p2gi|6623963 1.4

313103 EOS13034 AI184303 Hs.143806 ESTs 1.4 
328359 EOS26290 c18Jsgi|5867293|ie1|gnU9436 9494ex23COSI Z165988

CH.18_hsgi|5867293 1.4

305211 EOS05142 AA668563 EST singleton (not in UniGene) with exon hit 1.4 

334828 EOS34559 CH2a_1936FG 416_4 UMK.EMJVC005500.GENSCAN.277-4 

CH22JGENES.416 4 1.4 
326919 EOS26850 c21J)s gl]64567e2M6n 2- 40486 41046 ex 1 5 CDS! 17.70561 157

CH^I hsgl|6456782 1.4 

315527 EOS15458 AI791138 Hs.116768 ESTs 1.4 

308090 EOS06021 AA908609 EST singleton (not in UniGene) with exon hit 1.4

303316 EOS03247 AF033122 Hs.14125 p53 regulated PA26 nuclear protein 1.4

303642 EOS03573 AW299459 EST cluster (not in UniGene) with exon hit 1.4

314357 EOS14288 AA781795 Hs.122587 ESTs 1.4 

337102 EOS37033 CH2^5033FG_47a_7 CH22_FGENES.472-7 1.4 

304384 EOS04315 AA235482 Hs.62954 ferritin; heavy polypeptide 1 1.4

315117 EOS15048 AA828609 Hs.192044 ESTs 1.4

305750 EOS05681 AA835250 EST singleton (not in UniGene) with exon hit 1.4

311726 EOS11657 AW081766 Hs.253920 ESTs 1.4 
326996 EOS26927 c21Js giI58676601rB!jgn 4 - 63212 63404 ex 2 6 CDS1 15.70193622 

CH.21_hsgi|5867660 1.4
330257 EOS30188 c 5^)2 gii6671 881 |gb|Agn 2- 143228 143393ex 1 9 COS1 11.31 166586 

CH.05_p2gi|6671881 1.4

323864 EOS23795 AA340724 Hs.214028 ESTs 1.4
338204 EOS38135 CH22_6773FG_UNK_EM:AC005500.GENSCAN.241-3

CH22_EM:AC005500.GENSCAN.241-3 1.4

314025 EOS13956 AI983981 Hs.189114 ESTs 1.4 

315974 EOS15905 AW029203 HS.1919S2 ESTs 1.4
335599 EOS35530 CH22_2957FG 581.39_UNK^EIWAC005500.GENSCAN.476.37 

CH22_FGENES.581_39 1.4 
335364 EOS35295 CH22_2713FG_643 2-UNK_EI\(tAC005500.GENSCAN.432^ 

CH22_FGENES.543_2 1.4 

303634 EOS03565 AI953377 Hs.169425 ESTs; Weakly similar to predicted using Genefinder [C.elegans] 1.4

315626 EOS15557 AA808598 Hs.35353 ESTs; Weakly similar to H21P03.2 [C.elegans] 1.4
329936 EOS29867 c16j)2 glI5165200Igb|A gn 4 - 82761 82920 ex 3 4 COSi 1.15 160 199 

CH.16_p2gi|6165200 1.4
328632 EOS28563 c_7Jisgi[58682471ref|gn U 76734 76853 6x1 4 CDSf 13.951203764 

CH.07_hsgi|5868247 1.4

330207 EOS30138 gl5^2 gP13606bb|A gn 3 - 109912 110004 ex 2 4 CDS! 6.54 93 174 

CH.05_p2gi|6013606 1.4
329919 EOS29850 c16^2 gi|6223624bb|A gn 6 • 103492 103681 ex 1 8 CDSI 6.18 190 93 

CH.16_p2gi|6223624 1.4

331916 EOS31847 AA446131 Hs.124918 ESTs 1.4

317617 EOS17548 T58194 EST cluster (not in UniGene) 1.4

331943 EOS31874 AA453418 Hs.178272 ESTs 1.4 

306413 EOS06344 AA973288 EST singleton (not in UniGene) with exon hit 1.4

313607 EOS13538 N94169 Hs.194258 ESTs; Moderately similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.4
336292 EOS36223 CH22_3691FG_783_3_UNK_BA354I12.GENSCAN.4-7

CH22_FGENES.783_3 1.4

330453 EOS30384 H63976-HT4246 Pou-Domain Dna Binding Factor Pit1, Pituitary-Specific 1.4

324602 EOS24533 AA503620 Hs.213239 ESTs 1.4

332183 EOS32114 H08225 Hs.177181 ESTs 1.4 

320032 EOS19983 AI899772 Hs.202361 ESTs; Weakly similar to X-linked retinopathy protein [H.sapiens] 1.4
333156 EOS33087 CH2^387FG_89.6_UNieEM:AC000097.GENSCAN.84« 

CH22_FGENES.89_6 1.4
334156 EOS34087 ai2i.1435FG_340XUNICEM:AC00550aG£NSCAN.190.7 

CH22_FGENES.340J 1.4 
334303 EOS34234 CH22_1594FG_373_6_UNK_EM:AC005500.GENSCAN.233^

CH22_FGENES.373_6 1.4
325513 EOS25444 c12Js gij6017035|reli gn 1 - 34295 34490 ex 2 7 CDSI 6.49 196 2471 

CH.12_hsgi|6017035 1.4

302758 EOS02689 AA984563 EST cluster (not in UniGene) with exon hit 1.4

329557 EOS29488 c10j>2gl|3g62492igb|Agn6 - 53197536476x22COSf37.68451 247

CH.10_p2gi|3962492 1.4

331717 EOS31648 AA190888 Hs.15^1 ESTs; Highly similar to NY-REN^2 antigen [H.sapiens] 1.4
325885 EOS25816 c16Jis giI5867087ir6fign 11 .1-193212 193377 ex 1 3 CDSf 43.19 166 792 

CH.16_hsgi|5867087 1.4

312160 EOS12091 AASQSSOi Hs.184371 ESTs 1.4
328882 EOS28813 cjjisgi|6552423|i8qgn 2- 157669 157826 ex 4 6 COSi 4.91 1586200 



CH.07_hsgi|6552423

339028 EOS38959 CH2^792SF6_UNK.DA59H1&GENSCAN.22-8 

CH2U>A59H1&GENSCAN.22^ 

323497 EOS23428 AI523613 ^te^1544 ESTs

316897 EOS16828 AA838114 EST cluster (not in UniGene) 

312479 EOS12410 Ai950844 K&128Z38 ESTs; Weai^sininar to nonto beta ganunaH^iyslalEnlto 

338535 EOS38466 CH22_7251FG_UNK_EM:AC005500.GENSCAN.404-3

CH22_EM:AC005500.GENSCAN.404-3

312754 EOS12685 R99834 Hs.250383 ESTs 

327527 EOS27458 c_^hsgil6381882M9n 2 . 98950 99040ex48CDSI &7891 1768 



324714 
302347 
338008 

315590 
320825 
300930 
335225 

337303 



325472 

301266 
330901 
313406 
301454 
317269 
338876 



EOS24645 
EOS02278
EOS37939 

EOS15521 
EOS20756 
EOS00861 
EOS35156 

EOS37234 
EOS17129 
EOS08922
EOS25403 

EOS01197 
EOS30832 
EOS13337 

EOS01385

EOS17200 
EOS38807 



328481 EOS28412 



314022 
307640 
315541 
315489 
327815 



EOS13953 
EOS07571 
EOS15472 
EOS15420 
EOS27746 



339319 EOS39250 



322564 
323812 
303540 
337902 

335289 

327919 

337674 

320087 
334939 

303443 



327745 

335166 

324497 
338374 

313601 
321415 
305309 
330447 
308578 
315344 
330503 
308227 
332222 
323961 
314530 
320503 
306820 
304165 
324302 
319128 
317092 



EOS22495 
EOS23743 
EOS03471 
EOS37833 

EOS35220 

EOS27850 

EOS37605 

EOS20018 
EOS34870 

EOS03374 
EOS25860 

EOS27676

EOS35097 

EOS24428 



331433 
333348 



EOS13532 
EOS21346 
EOS05240 
EOS30378 
EOS08509
EOS15275
EOS30434
EOS08158
EOS32153 
EOS23892 
EOS14461 
EOS20434 
EOS08751 
EOS04096 
EOS24233 
EOS19059 
EOS17023 
EOS04^9 
EOS31364 
EOS33279 



AA829774 

M157818 Hs^38380 

AI248314 Hs.132932
AI751738 

AA906411 Hs.127378 



AW452420 
AI301992
AI168233
AA628245 



AA574312 Hs^45737 ESTs 
AF03940O H5.184S59 chbiridBChann^catefumac&vatedtM^ 
CH22_6490FG_UNK_EM:AC005500.GENSCAN.127-9

CH22_EM:AC005500.GENSCAN.127-9
AA640637 Hs.225817 ESTs 
N!yL004751 EST cluster (not in UniGene)

AI289481 Hs.136371 ESTs
CH22_2564FG_513_10_UNK_EM:AC005500.GENSCAN.406-9

CH22_FGENES.513_10
CH22_5442FG_681_5_ CH22_FGENES.681-5
Ai810384 KS.12602S ESTs 

AI879831 EST singleton (not in UniGene) with exon hit

ciajiS8l|6017034|ref| on 7 - 289581 289657 ex 2600^ 4.74771786 
CH.1ILhsgi|6017034 
EST cluster (not in UniGene) with exon hit
Human endogenous retroviral protease mRNA; complete cds
ESTs 

EST cluster (not in UniGene) with exon hit
ESTs 

CH22_7733FG_UNK_DJ32I10.GENSCAN.4-2

CH22_DJ32I10.GENSCAN.4-2
c_7_hs gi|5868449|ref| gn 1 - 8987 9180 ex 4 31 CDSi 10.00 194 2103
CH.07_hsgi|5868449
Hs.248678 ESTs

EST singleton (not in UniGene) with exon hit
Hs.123159 ESTs; Weakly similar to KIAA0668 protein [H.sapiens]
Hs.191847 ESTs 
cj.hs gl|5867968|refl gn 6 + 70804 71401 ex 2 2 CDSI 27.99 598 1000 

CH.05_hsgi|5867968
CH22_8280FG_UNK_BA354I12.GENSCAN.22-19

CH22_BA354I12.GENSCAN.22-19
W88440 Hs.118344 ESTs 
AW081373 Hs.199199 ESTs 

AA35S607 H5.'l735d0 ESTs; Weakly similar to MMSET type I [H.sapiens]
CH22_6314FG_UNK_EM:AC005500.GENSCAN.56-13

CH22_EM:AC005500.GENSCAN.56-13
CH22_2631FG_527_2_UNK_EM:AC005500.GENSCAN.421-2

CH22_FGENES.527_2
c_6_hsgl|5868165|ref|gn 6 + 547701 547800 ex 14 14 COSI -0.20 100 505 

CH.06_hsgi|5868165
CH22_6005FG_UNK_EM:AC000097.GENSCAN.67-4

CH22_EM:AC000097.GENSCAN.67-4
AF032387 Hs.113265 small nuclear RNA activating complex; polypeptide 4; 190kD
CH22_2259FG_465_3^UNK.EM-AC00550aGENSCAN.359^ 

CH22_FGENES.465_3
AA320525 EST cluster (not in UniGene) with exon hit

c16_hs gi|5867125|ref| gn 2 - 51715 51998 ex 1 1 CDSo 29.05 282 1594

CH.16_hsgi|5867125
C.5J1S gi|65319S9|rei| gn 1 • 229066 229124 ex 3 6 CDSI 3.01 59 177 

Ca05>gij653ig59 
CH2^^502FG.50%.10.UNK^M:ACOOS500.GENSGAN.3g8-25 

CH22_FGENES.502_10
AW152624 Hs.136340 ESTs 
CH22_7017FG_UNK_EM:AC005500.GENSCAN.327-1

CH22_EM:AC005500.GENSCAN.327-1
ESTs 

transmembrane 4 superfamily member 1
EST singleton (not in UniGene) with exon hit 
Pre-mRNA Splicing Factor Sf2p33, Alt. Splice Form 1
EST singleton (not in UniGene) with exon hit
ESTs 

Human cell surface glycoprotein P3.5d mRNA; partial cds
glyceraldehyde-3-phosphate dehydrogenase
ESTs 
ESTs 
ESTs 

EST cluster (not in UniGene)
EST singleton (not in UniGene) with exon hit
EST singleton (not in UniGene) with exon hit
ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!!
EST cluster (not in UniGene)
ESTs 

EST singleton (not In UniGene) with exon hit 
EST 



R32458 Hs.257711 
AI377596 Hs.3337 
AA699717 
HG3546-HT3744
AI708573 



AW292176 
M55024
AI559126
N28271 
AL044428 
AI05235B 
NM_005897
AI074408 
H73285 
AA543008 
AA393820 
A12e6162 
AA&21203 
H68097 



Hs.245834 

Hs.195188 
Hs.176618 
Hs.207345 
Hs.131741 



Hs.136808 
Hs.125657 



Hs.161023 



CH2Z.594FGJ40JLUNieEM:AC005500.GENSCAN.20-2 



1.4 

1.4 
1.4
1.4
1.4

1.4
1.4

1.4
1.4
1.4

1.4
1.4
1.4
1.4

1.4
1.4
1.4
1.4

1.4
1.4
1.4
1.4
1.4
1.4

1.4

1.4
1.4
1.4
1.4
1.4

1.4

1.4
1.4
1.4
1.4

1.4

1.4

1.4

1.4
1.4

1.3
1.3

1.3

1.3

1.3

1.3

1.3
1.3
1.3
1.3
1.3
1.3
1.3
1.3
1.3
1.3
1.3
1.3
1.3
1.3
1.3
1.3
1.3
1.3
1.3
1.3
CH22_FGENES.140J 1.3
333819 EOS33550 C^880FGJ19.UIN)eBtAO00^G£NSCAN^^ 

ai2?J6ENES^19_3 1^ 
335903 EOS35834 (»22.3280FG.635J1.UNKJBAAC00550aGENSCAN^25-^ 

CH2ZJGENES.635J1 1.3 
326219 EOS28150 c17Jisgl|5867226Hgn 11 -2640G8 264274 ex 35 COSI &74 267 2847 

CH.17_hsgi|5867226 1.3

324456 EOS24387 AWai0954 EST cluster (not in UniGene) 1.3

316405 EOS16336 AA757900 Hs^24 ESTs 1.3 

314361 EOS14292 AL038765 Hs.161304 ESTs 1.3
328546 EOS28477 cjjis gq5888487Ireqgn 1-17547 17722 ex 23 CDSi 9.96176 3284 

CH.07J» 5115868487 1.3 
335871 EOS3S802 a{2^J3246FGJ29J9.U^aeEM:AO(1055Q0.GENSCA^^^ 

CH2LFGENES.629J9 1.3 

303735 EOS03666 AA707750 Hs.202616 ESTs; Weakly similar to cis-Golgi matrix protein GM130 [R.norvegicus] 1.3

324048 EOS23979 AA378739 EST cluster (not in UniGene) 1.3

326720 EOS26651 c20_hs giI6552456|ref)gn 1*84525 84677 ex 5 7 COSi 11.78 153 1031 

CH.20_hsgi|6552456 1.3

322309 EOS22240 AF086372 EST cluster (not in UniGene) 1.3

322136 EOS22067 AF075083 EST cluster (not in UniGene) 1.3

313460 EOS13391 AW028655 Hs.136033 ESTs 1.3

^75 EOS06m AA936312 EST singleton (not in UniGene) with exon hit 1.3 

321974 EOS21905 N76794 EST cluster (not in UniGene) 1.3

327600 EOS27531 c_3_hs gi|6004462|ref| gn 1 - 2621 2862 ex 1 4 CDSI -4.01 242 1407

CH.03_hsgi|6004462 1.3
329086 EOS29017 c_x_hs gi|5868604|ref| gn 1 - 35489 35588 ex 2 9 CDSi 2.55 100 719

CH.X_hsgi|5868604 1.3

336919 EOS36850 CH2^.4690FG.346_6_ CH2?_FGENES.346^ 1.3 

302767 EOS02698 H94900 Hs.17882 ESTs 1.3 
334786 EOS34717 CH22_2098FG_432_11_UNK_EM:AC005500.GENSCAN.293-14

CH22_FGENES.432_11 1.3

302472 EOS02403 AA317451 Hs.241451 SWI/SNF related; matrix associated; actin dependent regulator of chromatin; sub^ 1.3
333033 EOS32964 CH22_259FG_68_8_UNK_EM:AC000097.GENSCAN.4a8

CH22_FGENES.68_8 1.3

330493 EOS30424 M27826 Hs.238380 Human endogenous retroviral protease mRNA; complete cds 1.3

330506 EOS30437 M61906 Hs.6241 phosphoinositide-3-kinase; regulatory subunit; polypeptide 1 (p85 alpha) 1.3

313932 EOS13863 AI147601 Hs.154087 ESTs 1.3

314394 EOS14325 AI380563 Hs.130816 ESTs 1.3

323033 EOS22964 AI744284 Hs.221727 ESTs 1.3
326431 EOS2S382 c19_hsgli5867371|rel|gn 1> 15855 15971 ex 4 6 00^ 7.791171108 

CH.19_hsgi|5867371 1.3
335547 EOS35478 CH22_2902FG_576_8_UNK_EM:AC005500.GENSCAN.467-8

CH22_FGENES.576_8 1.3

300548 EOS00479 AI026836 Hs.114689 ESTs 1.3

316504 EOS16435 AW135854 Hs.132458 ESTs 1.3
335756 EOS35687 CH22_3123FG_604_5_UNK_EM:AC005500.GENSCAN.493-10

CH22_FGENES.604_5 1.3

301209 EOS01140 AI809912 Hs.159354 ESTs 1.3

306610 EOS06541 AI000635 EST singleton (not in UniGene) with exon hit 1.3

314439 EOS14370 AI539443 Hs.137447 ESTs 1.3 

315396 EOS15327 AW296107 Hs.152686 ESTs 1.3 
335914 EOS35845 CH22_3291FG_636_10_UNK_EM:AC005500.GENSCAN.526-10

CH22_FGENES.636_10 1.3
333734 EOS33665 CH22_1000FG_260_2_UNK_EM:AC005500.GENSCAN.119-7

CH22_FGENES.260_2 1.3

312370 EOS12301 AA744692 Hs.166539 ESTs 1.3 

304636 EOS04567 AA524031 EST singleton (not in UniGene) with exon hit 1.3 

323166 EOS23097 AA291001 EST duster (not in UniGene) 1.3 

338702 EOS38633 CH22_7482FG_UNK_EM:AC005500.GENSCAN.480-1

CH22_EM:AC005500.GENSCAN.480-1 1.3 

322331 EOS22262 AF086467 EST cluster (not in UniGene) 1.3

318706 EOS18637 AI383593 Hs.159148 ESTs 1.3

331186 EOS31117 T4115g Hs.8418 ESTs 1.3 
334764 EOS34695 CH22_2076FG_428_13_UNK_EM:AC005500.GENSCAN.289-13

CH22_FGENES.428_13 1.3
327565 EOS27496 c_3_hs gi|5867811|ref| gn 1 + 32516 32778 ex 2 3 CDSi 0.20 263 368

CH.03_hsgi|5867811 1.3
335524 EOS35455 CH22_2879FG_572_4_UNK_EM:AC005500.GENSCAN.461-4

CH22_FGENES.572_4 1.3

308050 EOS07981 AI460004 EST singleton (not in UniGene) with exon hit 1.3

334172 EOS34103 CH22_1452FG_349_5_UNK_EM:AC005500.GENSCAN.208-6

CH22_FGENES.349_5 1.3

315674 EOS15605 AA651923 Hs.191850 ESTs 1.3
334876 EOS34807 CH22_2190FG_450_6_UNK_EM:AC005500.GENSCAN.339-6

CH22_FGENES.450_6 1.3

315606 EOS15537 AW298724 Hs.202639 ESTs 1.3 
338779 EOS38710 CH22_7610FG_UNK_EM:AC005500.GENSCAN.526-15

CH22_EM:AC005500.GENSCAN.526-15 1.3
333511 EOS33442 CH22_766FG_171_5_UNK_EM:AC005500.GENSCAN.51-5

CH22_FGENES.171_5 1.3
329254 EOS29185 cjUisgil58687331reflgn 1-^413342146x1 2 COSi -0.38 82 2833

CH.X_hsgi|5868733 1.3

319510 EOS19441 W88633 Hs.254562 ESTs 1.3
339418 EOS39349 CH22_8411FG_UNieDJ579Nl6.GENSCAN.114 

CH22JXI579N16.GENSCAN.1M 1.3 

321012 EOS20943 AA737314 EST cluster (not in UniGene) 1.3
333217 EOS33148 ai2^4MFGJ04XUNieEMAa)(N)097.GENSCAN.1^

CH2?JGe«E&104.9 U 
33B561 £0838492 (»2^J294FG.UNieE(ytAC00550aGB<SCAN.421^ 

CH2LEM:AC00550aGENSCAN.421^ 1.3 
335742 EOS35673 a12^.3109=G.601J3.UNleEM:AC005500.6ENSCAN.49VU

CH2^GENES.601J3 1.3 
334993 EOS34924 CH22J314FG.469J4_UNK.EMAC005500.GENSCAN.365-16 

CH22.F6ENES.469J4 1.3 

323430 EOS23361 AW062479 EST cluster (not in UniGene) 1.3

306069 EOS06000 AA9089B3 EST singleton (not in UniGene) with exon hit 1.3

331681 EOS31612 W85712 Hs.119571 collagen; type III; alpha 1 (Ehlers-Danlos syndro 1.3
337986 EOS37917 CH22_6441FG_UNK_EM:AC005500.GENSCAN.110-7

CH22_EM:AC005500.GENSCAN.110-7 1.3

313204 EOS13135 AI800518 Hs.118158 ESTs 1.3

323189 EOS23120 AL121194 Hs.120589 ESTs 1.3

318171 EOS18102 AA381202 EST cluster (not in UniGene) 1.3

307156 EOS07087 AJ186762 EST singleton (not in UniGene) with exon hit 1.3

332713 EOS32644 AA349792 Hs.78489 mutY (E. coli) homolog 1.3

312828 EOS12759 AI865455 Hs.211818 ESTs; Moderately similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! 1.3

301127 EOS01058 AA758ia9 Hs.121072 ESTs 1.3

311260 EOS11191 AI672509 Hs.196582 ESTs 1.3
338364 EOS38295 CH22_7007FG_UNK_EM:AC005500.GENSCAN.323-7

CH22_EM:AC005500.GENSCAN.323-7 1.3
337904 EOS37835 CH22_6318FG_UNK_EM:AC005500.GENSCAN.56-17

CH22_EM:AC005500.GENSCAN.56-17 1.3

329347 EOS29278 c_X_hs gi|6456785|ref| gn 1 + 18433 18897 ex 44 CDSf 43.39 465 3718

CH.X_hsgi|6456785 1.3

313329 EOS13260 AW293704 Hs.122658 ESTs 1.3

314367 EOS14298 AA535749 EST cluster (not in UniGene) 1.3

317098 EOS17029 AJ1235i3 Hs.125456 ESTs 1.3

306462 EOS06393 AA983397 EST singleton (not in UniGene) with exon hit 1.3

301254 EOS01185 Aia49624 EST cluster (not in UniGene) with exon hit 1.3

335504 EOS35435 CH2aj856FG.57U5.UNK-EM:ACQ05500.G£NSCAN.460^ 

CH2^JFGENES.571J5 1.3 
35 334270 EOS34201 CH22_1559FG_358J_UNK„EM:AC005500.GENSCAN.228^ 

CH22„FGENES.368J 1.3 
334324 EOS34255 CH22_1616FG_375_1_UNK_EM:AC005500.GENSCAN.235-1

CH22_FGENES.375_1 1.3

304254 EOS04185 AA046273 Hs.111334 ferritin; light polypeptide 1.3

305731 EOS05662 AA829363 EST singleton (not in UniGene) with exon hit 1.3

323284 EOS23215 AA279381 Hs.190010 ESTs 1.3 

322007 EOS21938 AW410846 Hs.165739 ESTs 1.3 
334537 EOS34468 CH22.1839FG.403^2.UNK.EM:AC00550aGENSCAN.288-2 

CH22JGENES.403J2 1.3 

302360 EOS02291 AJ010901 Hs.198267 mucin 4; tracheobronchial 1.3

311641 EOS11572 AI948829 Hs.213786 ESTs 1.3 

324643 EOS24574 AI436356 Hs.130729 ESTs 1.3 
327554 EOS27485 cjjis giI5867801|reflgn 2 -23092 23191 ex 26 CDSl 10.44 100 107 

CH.03_hsgi|5867801 1.3

312165 EOS12096 AW292i39 Hs.115789 ESTs 1.3

304679 EOS04610 AA548741 EST singleton (not in UniGene) with exon hit 1.3

319564 EOS19495 AA026777 HS.1G9732 ESTs 1.3

310860 EOS10791 AW015920 Hs.161359 ESTs 1.3

337161 EOS37092 CH2a_5180FG.561X Oi22_FGENES.561-3 1.3 

311155 EOS11086 AI634410 Hs.197608 EST 1.3

336846 EOS36777 CH22.4540FGJ63J_ CH22_FGENES^8« 1.3

310985 EOS10916 T51842 EST cluster (not in UniGene) 1.3

329499 EOS29430 c10_p2 gi|3983518|gb|A gn 5 - 33463 33789 ex 1 1 CDSo 34.50 327 97

CH.10_p2gi|3983518 1.3
334924 EOS34855 CH22_2244FG_459_2_UNK_EM:AC005500.GENSCAN.351-2

CH22_FGENES.459_2 1.3

330861 EOS30792 AA084064 Hs.1d5747 ESTs 1.3 

324658 EOS24589 AI694767 Hs.129179 ESTs 1.3 

323362 EOS23293 AL135067 Hs.117182 ESTs 1.3

330468 EOS30399 L1034d Hs.112341 protease inhibitor 3; skin-derived (SKALP) 1.3

314198 EOS14129 AAq97581 Hs.128773 ESTs 1.3
339436 EOS39367 CH22_8431FG_UNK_DJ579N16.GENSCAN.19-1

CH22_DJ579N16.GENSCAN.19-1 1.3

312483 EOS12414 AI417526 Hs.184636 ESTs 1.3

321505 EOS21436 H73183 Hs.129885 ESTs 1.3

332254 EOS32185 N64702 Hs.194140 ESTs 1.3 
328253 EOS28184 c_6Jisgi|6381894|r6f]gn 1-4411 4509 ex 1 5 COSI 4.20994561 

CH.06_hsgi|6381894 1.3

332357 EOS32288 W73417 Hs.103183 EST 1.3
329017 EOS28948 c_X_hs gi|6682532|ref| gn 7 - 255591 255672 ex 3 3 CDSf 12.94 82 22

CH.X_hsgi|6682532 1.3

337504 EOS37435 CH22_5739FG^803X CH22J^GENES.803-2 1.3 

316525 EOS16556 AA780307 Hs.122156 ESTs 1.3 
335389 EOS35320 CH22_2739FG_545_1_UNK_EM:AC005500.GENSCAN.436-1

CH22_FGENES.545_1 1.3

310017 EOS09948 AI188739 Hs.148488 ESTs 1.3

314354 EOS14285 AU)37984 Hs.208982 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.3

324641 EOS24572 AI732515 Hs.189218 ESTs 1.3
335207 EOS35138 CH22J»46FGL510J^UNK.EMAC005500.GENSCAN.402^

CH22_FGENES.510_4 1.3

333673 EOS33604 CH22_934FG_246_5_UNK_EM:AC005500.GENSCAN.101-3
CH22_FGENES.246_5 1.3
334370 EOS34301 CH22.1664R]L378JBJLINKJMACQ0550aGENSCANmi 

CH22_FGENES.378_18 1.3
328890 EOS28621 cjrjisri6588001|ref|gn7-571207571274ex1 3 CDS! 134684325

CH.07_hsgi|6588001 1.3

323208 EOS23139 AA203415 Hs.136200 ESTs 1.3

307010 EOS06941 AI1400U EST singleton (not in UniGene) with exon hit 1.3

316563 EOS16494 AI587083 H&20(S58 ESTs; Weakly similar to !!!! ALU SUBFAMILY SP WARNING ENTRY !!!! [H.sapiens] 1.3

312219 EOS12150 H7350S Hs.117874 ESTs 1.3

319884 EOS19815 T73234 EST cluster (not in UniGene) 1.3

334720 K)S34651 a42?J030F6.42U1.Ui^EMAaM)550a(£t4SCAN.28231 

CH2a/G€NES.42U1 1.3 
335836 EOS35767 CH2^3210FGJ21XUI^E^I:ACX)Q550aGE^lSCA^L51» 

CH22JGENES.621_3 1.3 

305448 EOS05379 AA737894 Hs.2d797 ribosomal protein L10 1.3

314885 EOS14816 AI049878 Hs.133032 ESTs 1.3

320130 EOS20061 AI820575 Hs.203804 ESTs 1.3

310567 EOS10498 AI591065 Hs.155780 ESTs 1.3

323898 EOS23829 AA347566 EST cluster (not in UniGene) 1.3

336132 a)S36063 CH2^3522FG_703_^UNK^OA59H18.6£NSCAN.9-2 

CH22_FGENES,703J 1.3 
337958 EOS37889 CH2^64O3FG_UNK.EMAC005500.GENSCAN.98^ 

<m.EM:AO00S500.GENSCAN.98« 1.3 

305630 EOS05561 AA804508 EST singleton (not in UniGene) with exon hit 1.3

334916 EOS34847 (>l22J235FG.457.7_UNK^EI^C00550aGENS(m347.1 

CH22_FGENES.457_7 1.3 
333542 EOS33473 CH22_799FG_178_4_UNK_EM:AC005500.GENSCAN.59-4

CH22_FGENES.178_4 1.3

331151 EOS31082 R82331 Hs.164599 ESTs 1.3 

315095 EOS15028 AA831815 Hs.243788 ESTs 1.3 

331593 EOS31524 N72150 Hs.50193 EST 1.3 

323767 EOS23698 AI807408 Hs.166368 ESTs 1.3
334561 EOS34492 CH2^.1865FG.4O5J.UNK.BI:AO00S50a6ENSGAN^(>« 

CH2^FGENE$.405J 1.3 

308191 EOS08122 AI538878 EST singleton (not in UniGene) with exon hit 1.3

319571 EOS19502 N91399 Hs.220826 ESTs 1.3 

316200 EOS16131 A1914535 Hs^21377 ESTs 1.3 

305996 EOS05927 AA889338 Hs.163356 EST 1.2 

318055 EOS17986 A1249193 Hs.145945 ESTs 1.2 

315570 EOS15501 AI860360 Hs.160316 ESTs 1.2 

320792 EOS20723 AW236504 Hs.247020 ESTs 1.2

331649 EOS31580 W20384 Hs^5412 ESTs; Weakly similar to c29 [M.musculus] 1.2

303839 EOS03770 Z45939 EST cluster (not in UniGene) with exon hit 1.2

324399 EOS24330 AA814768 Hs.21396 ESTs 1.2

317172 EOS17103 AI741232 Hs.2C6744 ESTs 1.2

312452 EOS12383 AI692643 Hs.172749 ESTs 1.2
325482 EOS25413 ciajisgil5866957(reqgn34.47g5748078ex57CDSI10.251221896

CH.12_hsgi|5866957 1.2

311395 EOS11326 R23313 EST cluster (not in UniGene) 1.2

336124 EOS36055 CH22_3513FG_701_9_UNK_DA59H18.GENSCAN.8-9

CH22_FGENES.701_9 1.2

320082 EOS20013 AA487678 Hs.189738 ESTs 1.2

312168 EOS12099 T92251 Hs.198882 ESTs 1.2
338000 EOS37931 CH22_6472FG_UNK_EM:AC005500.GENSCAN.119-5

CH22_EM:AC005500.GENSCAN.119-5 1.2
338852 EOS38783 CH22_7705FG_UNK_DJ246D7.GENSCAN.12-1

CH22_DJ246D7.GENSCAN.12-1 1.2

312090 EOS12021 N57592 Hs.118064 ESTs 1.2

316480 EOS16411 AI749921 Hs.205377 ESTs 1.2
333259 EOS33190 CH22_600FG_118_7_UNK_EM:AC005500.GENSCAN.2-7

CH22_FGENES.118_7 1.2
335211 EOS35142 CH22_2550FG_511_2_UNK_EM:AC005500.GENSCAN.403-2

CH22_FGENES.511_2 1.2

321950 EOS21881 AA594780 Hs.172318 ESTs 1.2
337937 EOS37868 CH22_6370FG_UNK_EM:AC005500.GENSCAN.86-1

CH22_EM:AC005500.GENSCAN.86-1 1.2

316576 EOS16507 AI732114 Hs.193046 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.2

322770 EOS22701 AA045798 Hs.159971 SWI/SNF related; matrix associated; actin dependent regulator of chromatin 1.2
329369 EOS29300 cjOis gi|5868842]reflgn 1-121148 121516 ex 3 4CDSI &50 369 3910 

CH.X_hsgi|5868842 1.2

304163 EOS04114 H91161 EST singleton (not in UniGene) with exon hit 1.2

339370 EOS39301 CH22_8343FG_UNK_BA232E17.GENSCAN.1-12

CH22_BA232E17.GENSCAN.1-12 1.2

303941 EOS03872 AW473878 Hs.156110 Immunoglobulin kappa variable 1D-8 1.2

302245 EOS02176 H18835 EST cluster (not in UniGene) with exon hit 1.2

335255 EOS35186 CH22J597FG,517_^_UNK^EM:AC0C55C0.GENSCAN.411.2 

CH22_FGENES.517J 1.2

316610 EOS16541 AW087973 Hs.126731 ESTs 1.2

314915 EOS14846 AA573072 Hs.187748 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.2

315426 EOS15357 AI391486 Hs.128171 ESTs 1.2
334003 EOS33934 CH22_1281FG_310_18_UNK_EM:AC005500.GENSCAN.167-27

CH22_FGENES.310_18 1.2

304350 EOS04281 AA186871 EST singleton (not in UniGene) with exon hit 1.2

325173 EOS25104 AI133215 Hs.144662 ESTs; Moderately similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.2

312313 EOS12244 AW293341 Hs.122505 ESTs 1.2
333366 EOS33297 CH22_612FG_142_3_UNK_EM:AC005500.GENSCAN.22-8
CH22_FGENES.142_3 1.2
334970 EOS34901 CH22J291FG 4S6 3.UNK.EM:ACQ0S50aGENSCAM361'2 

CH2^JGENEa486.3 1.2 
338668 EOS38599 CH22.7441FG.JLiNKJEM:AO0055(MKGENSGA^ 

CH22_EM:AC005500.GENSCAN.465-1 1.2

336502 EOS36433 CH22.3928FGJ33XUNK.(X1579NiaGENSCAN.5-9 

CH22J^GENES^J 1.2 

309438 EOS09389 AW1028Q2 Ks.225787 ESTs;IUtoderatefy similar to hypoitoficaipra^ 1.2 
^. 338194 EOS38125 CH22_3591FG 717.20 UNKJ2A59Hia(£NSCAN.2D.19 

CH22_FGENES.717_20 1.2

338678 EOS36«)9 CH2a.4156FG 43_6 CH2^/GENE&43^ ' 1.2 

321401 EOS21332 W90406 K5^962 ESTs 1.2 

308026 EOS05957 M902309 EST singleton (not in UniGene) with exon hit 1.2

336434 EOS36365 CH22_3854FG_826J_UNK.BA232E17.GENSCAN*1 

15 CH22JGENES^J i2 

315257 EOS15188 AW157431 Hs^48941 ESTs 1.2 
328349 EOS28280 c_7JisgP68383[ref|gn 7 - 260704 260804 ex 2 9 COS! 4.37 101 621 

CH.07_hsgi|5868383 1.2
326112 EOS26043 c17_hs gi|5867192|ref| gn 1 + 2151 2725 ex 1 1 CDSI 54.87 575 1272

CH.17_hsgi|5867192 1.2

333995 EOS33926 CH22_1272FG_310_9_UNK_EM:AC005500.GENSCAN.167-18

CH22_FGENES.310_9 1.2

323683 EOS23614 AI380045 Hs.225033 ESTs 1.2 
330143 EOS30074 c21j)2 gi|42104301emb|gn 3 +184737 184848 ex 4 4 CDSI 1.71 112111 

CH.21_p2gi|4210430 1.2

329789 EOS29720 c14_p2 gi|6469354|emb| gn 2 - 118977 119036 ex 1 3 CDSl 1.19 60 1517

CH.14_p2gi|6469354 1.2

324397 EOS24328 M307836 Hs.118758 ESTs; Weakly similar to RLF [H.sapiens] 1.2

308729 EOS08660 AI799766 Hs.208627 EST 1.2

323939 EOS23870 AW499832 Hs.115696 ESTs 1.2
333444 EOS33375 CH22jB94R3_153J^UNK_EM:AC005500.GENSCAN.34-1

CH22_FGENES.153_1 1.2

306302 EOS06233 AA937901 EST singleton (not In UniGene) with exon hit 1.2 

313693 EOS13624 AW46918D Hs.170651 ESTs 1.2 

316652 EOS16583 AA789249 EST cluster (not in UniGene) 1.2

332325 EOS32256 T79428 Hs.191264 ESTs 1.2 
336235 EOS36166 CH22L3633FGJ40XUNKJDA59H18.GENSCAN.44-2 

CH22JFGENES.740J 1.2 

319436 EOS19367 R02750 EST cluster (not in UniGene) 1.2 

40 312335 EOS12266 AW043620 Hs.236993 ESTs 1.2 

322109 EOS22040 AI884327 Hs.244737 ESTs 1.2
328466 EOS28397 c_7_hs gi|5868434|ref| gn 1 - 15643 15900 ex 1 2 CDSl 2.36 258 1608

CH.07_hsgi|5868434 1.2

323244 EOS23175 n0731 EST cluster (not in UniGene) 1.2

312510 EOS12441 AA779907 Hs.117558 ESTs 1.2

314853 EOS14784 AA729232 Hs.153279 ESTs 1.2

336946 EOS36877 CH22_4731FG_355_2_ CH22_FGENES.355-2 1.2

303874 EOS03805 AA258921 EST cluster (not in UniGene) with exon hit 1.2

312658 EOS12589 AA730280 Hs.120936 ESTs 1.2

308354 EOS08285 AI611044 EST singleton (not in UniGene) with exon hit 1.2

310073 EOS10004 AI335004 Hs.148558 ESTs 1.2

324777 EOS24708 AA744046 Hs.133350 ESTs 1.2

300897 EOS00828 AI890356 Hs.127804 ESTs 1.2

308371 EOS08302 AI620866 Hs.242510 EST 1.2

306358 EOS06289 AA961821 EST singleton (not in UniGene) with exon hit 1.2

312295 EOS12226 AA578233 Hs.173863 ESTs 1.2

319792 EOS19723 R20317 Hs^68 ESTs 1.2 
338546 EOS38477 CH22_7267FG_UNK_EM:AC005500.GENSCAN.410-1

CH22_EM:AC005500.GENSCAN.410-1 1.2

314546 EOS14477 AW007211 Hs.186672 ESTs 1.2
338494 EOS38425 CH22_71MFG_UN1CEM:AC005500.GENSCAN.385^ 

CH22 EM:AC005500.GENSCAN.385^ 1.2 

331131 EOS31062 R54797 Hs.26238 EST; Weakly similar to reverse transcriptase homolog [H.sapiens] 1.2

309939 EOS09870 AW419122 EST singleton (not in UniGene) with exon hit 1.2 

65 332932 EOS32863 (>12^_1S3FG 38.6.UNK_C20H1ZGENSCAN.29^ 

CH22_FGENES.38J 1.2 

309653 EOS09584 AWigSSOO Hs.180842 ribosomal protein L13 1.2

318647 EOS18578 AI526152 EST cluster (not in UniGene) 1.2

304044 EOS03975 T52479 Hs.252259 ribosomal protein S3 1.2
330307 EOS30238 c_7_p2 gi|4877982|gb|A gn 2 + 107384 107559 ex 2 4 CDSl 9.96 176 4

CH.07_p2gi|4877982 1.2

314499 EOS14430 AL044570 Hs.147975 ESTs 1.2
338053 EOS37984 CH22JS52FG_UNK.pyfcAC00550aGENSCAN.158-1 

ai2?.EMAC005500.GENSCAN.158-1 1.2 
75 332991 EOS32922 CH2aj15F6 56 4 UNJeEMAC000097.GENSCAN.174 

CH22_FGENES.56_4 1.2

306308 EOS06239 AA946870 EST singleton (not in UniGene) with exon hit 1.2

338120 EOS38051 CH22_6655FG_UNK_EM:AC005500.GENSCAN.195-1

CH22_EM:AC005500.GENSCAN.195-1 1.2

313703 EOS13634 AI161293 Hs.146862 ESTs; similar to KIAA0525 protein [H.sapiens] 1.2

330563 EOS30494 U50553 Hs.147916 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 3 1.2
332886 EOS32817 CH22_108FG_33_7_UNK_C20H12.GENSCAN.22-9

CH22_FGENES.33_7 1.2

303844 EOS03775 U94362 Hs.58589 glycogenin 2 1.2

321755 EOS21686 A1215te1 Hs.144042 ESTs 1.2
333532 EOS33463 CH22_789FG_175_19_UNK_EM:AC005500.GENSCAN.63-25



333^ 

317459 
315353 
300732 
303502 
333126 

3329^ 

329502 



315472 
328290 



328662 EOS28593 

319808 EOS19739
303929 
315712 
307391 
335499 

303792 
327287 

317713 
330137 

308157 
314452 



321487 
320993 
336778 
319827 
308249 
310094 
336902 
339044 

336675 
303563 
330673 
311814 
335481 

314775 
324961 
313458 
307074 
337964 



337366 
322340 
307954 
328615 

317787 



AI510824 

AL042699 

AI567509 

X13075 

AL050145 



CH22_FGENES.175_19
EOS32734 CH22.61FG.2a.3.UNK.C^H11G£NSCAN.18^ 

CH22_FGENES^ 3 e 
EOS33185 GH2L495FGJ18.a.UNteailA(XX^GENS(>Ui2-2 

Q^GENES.11^ 
EOS17390 AI357254 Hs.131248 ESTs 
EOS15284 AW452608 H3.129817 ESTs 
EOS00663 AI369956 Hs.257891 ESTs 

EOS03433 AA488523 EST cluster (not in UniGene) with exon hit

EOS33057 CH2^.355FGJ2_3„UNieEMAC000097.GENSCAN.a.10 

CH22_FGENES^J 
EOS32850 CH22»150FG_38 3^UNieC20H12.GENSCAM29^ 

£0829433 c10_p2gil3983517|gb|Ugn1 •••75 338 ex 1 tCDSo48L82 264100 

CH.10_p2gi|3983517
EOS33339 CH2^657FG_145.6_UNK.EMAC005500.GENSCAH2&€ 

CH22_FGENES.145J 
EOS15403 AA828850 Hs.165469 ESTs 

EOS28221 c 7JisgiI5868363|re!]gn2-127356 127496ex15COSl S24131289 

Ca07_hsg!15868353 
c_7_hs gi|6004473|ref| gn 22 + 1184773 1184855 ex 7 8 CDSi 12.72 83 3916

CH.07_hsgi|6004473
T58960 EST cluster (not in UniGene)

AW470753 EST singleton (not in UniGene) with exon hit

AI950133 Hs.120882 ESTs; Moderately similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!!
AI225058 EST singleton (not in UniGene) with exon hit

(»1ZL2851FG,571_8_UNieEM:ACX)05500.GENSCAN.460.28 

CH2?_F6ENES.57U 
C75094 Hs.199839 ESTs; Highly similar to N622 [H.sapiens]
c J Jis gll5867479|refl gn 1 - 62838 63024 ex 4 5 COS! 11 .66 187 1628 

CH.01_hsgi|5867479
AI733306 Hs.128071 ESTs 

C21J2 gi|4210430|enit)| gn 1 • 21220 21377 ex 2 3 CDSf 1.69 158 104 
CH.21_p2gi|4210430
thymosin; beta 4; X chromosome
ESTs

collagen; type I; alpha 1
EST cluster (not in UniGene)

Homo sapiens mRNA; cDNA DKFZp586C2020 (from clone DKFZp586C2020)
CH22_FGENES.159-4
EST cluster (not in UniGene)
EST singleton (not in UniGene) with exon hit 
ESTs 

CH22LFGENES.331.2 
CH22_7944FG_UNK_DA59H18.GENSCAN.27-5

CH22_DA59H18.GENSCAN.27-5
CH22_FGENES.43^

transforming growth factor; beta-induced; 68kD
Sec23 (S. cerevisiae) homolog A
ESTs; Moderately similar to zinc finger protein [H.sapiens]
CH2e^833FG 570 10 UNK_EM:AC005500.GENSCAN.4604 

CH22_FGENES.570J0 
AI149880 Hs.188809 ESTs 
AA613792 EST cluster (not in UniGene)

EOS13389 AA007259 Hs.255853 ESTs

EOS07005 AI150989 EST singleton (not in UniGene) with exon hit

EOS37895 CH22_6410FG_UNK_EM:AC005500.GENSCAN.100-9

CH22_EM:AC005500.GENSCAN.100-9
326519 EOS26450 c19_hs gi|5867439|ref| gn 4 + 166004 166243 ex 4 5 CDSi 4.49 240 2534

CH.19_hsgi|5867439

EOS37297 CH22_5551FG_736_1_ CH22_FGENES.736-1
EOS22271 AF088076 EST cluster (not in UniGene)

EOS07885 AI419892 EST singleton (not In UniGene) with exon hit 

EOS28546 c_7_hs gi|5868239|ref| gn 2 35214 35347 ex 3 4 CDSi 11.49 134 3651

CH.07_hsgi|5868239
EOS17718 AW339612 Hs.249364 ESTs
EOS35219 CH22_2630FG_527_1_UNK_EM:AC005500.GENSCAN.421-1

CH22_FGENES.527_1
EOS23106 AI827137 Hs.184023 ESTs
EOS30824 AA149620 Hs.71999 ESTs

EOS06741 AI057294 EST singleton (not in UniGene) with exon hit

EOS38170 CH22_6833FG_UNK_EM:AC005500.GENSCAN.264-5

CH22_EM:AC005500.GENSCAN.264-5
EOS32278 W60326 Hs.221716 ESTs
EOS09713 AW275156 Hs.156110 Immunoglobulin kappa variable 1D-8
EOS22449 AI133446 EST cluster (not in UniGene)

EOS01118 AA806S42 EST cluster (not in UniGene) with exon hit

EOS12060 AW300B67 EST cluster (not in UniGene)

EOS34645 CH22_2024FG.421J5_UN)l.EJi&AC00550aGENSCANm25 

CH22_FGENES.421_25
EOS16517 AI205077 Hs.144689 ESTs
EOS20419 R31386 EST cluster (not in UniGene)

EOS27389 c_2_hs gi|6004455|ref| gn 3 + 173257 173378 ex 5 7 CDSi 4.03 122 1184

CH.02_hsgi|6004455
EOS36638 CH22_4212FG_64_3_ CH22_FGENES.64-3
EOS13492 AA040155 EST cluster (not in UniGene)



EOS15643
EOS07322
EOS35430 

EOS03723 
EOS27218 

EOS17644 
EOS30068 

EOS08088 
EOS14383 
EOS08199 
EOS21398 
EOS20924 
EOS36709 
EOS19758 
EOS08180 
EOS10025 
EOS36833 
EOS38975 

EOS36606 
EOS03494
EOS30604
EOS11745
EOS35412

EOS14706 



Hs.75968 
Hs.209222 
Hs.172928 

Hs.225986 
CH22-4367FGJ59„4_ 
T62778 
AI560998 

AW450967 Hs.235240 
CH2^4655FG.331J. 



CH22J153FG..43_3_ 
AA367699 Hs.118787
D57823 Hs.92962
AW377113 Hs.119640 



323175 
330893 
306810 
338239 

332347 
309782 
322518 
301187 
312129 
334714 

316586 
320488 
327458 

335707 
313561 



1.2 
1.2 

1.2 

1.2 
1.2 
1.2 
1.2 

1.2 

1.2 

1.2 

1.2 
1.2 

1.2 

1.2 
1.2 
1.2 
1.2 

1.2 

1.2 
1.2 

1.2 
1.2

1.2 
1.2 
1.2 
1.2 
1.2 
1.2 
1.2 
1.2 
1.2 
1.2 
1.2 

1.2 
1.2 
1.2 
1.2 
1.2 

1.2 
1.2 
1.2 
1.2 
1.2 

1.2 

1.2 
1.2 
1.2 
1.2 

1.2 
1.2 

1.2 
1.2 
1.2 
1.2 

1.2 
1.2 
1.2 
1.2 
1.2 
1.2 

1.2 
1.2 
1.2 

1.2 
1.2 
1.2 
330906 EOS30837 AA169498 Hs.72804 ESTs 1.2

330987 EOS30918 K40988 Hs.131965 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.2

325041 EOS24972 AI809182 Hs.130907 ESTs 1.2

313225 EOS13156 AA502384 Hs.151529 ESTs 1.2

305295 EOS05226 AA687131 EST singleton (not in UniGene) with exon hit 1.2

306898 EOS06827 AI093383 EST singleton (not in UniGene) with exon hit 1.2

326981 EOS28912 c21Jis 8i|6588016|iBqgn 3 ^105091106038 ex 11 CDSo 122.69 948 567 

CH.21_hsgi|6588016 1.2

332225 EOS32156 N33213 Hs.1004K ESTs 1.2 

318802 EOS18733 R19443 Hs.92414 ESTs 1.2

318413 EOS18344 AI138592 Hs.144936 ESTs 1.2 

312292 EOS12223 AW451893 Hs.151124 ESTs 1.2 

323753 EOS23684 M3271Q2 EST cluster (not in UniGene) 1.2

313582 EOS13513 AW207684 Hs.13583 ESTs 1.2 

317836 EOS17767 AA983913 Hs.128929 ESTs 1.2
332868 EOS32799 CH22_86F6.28.8.UNK.C20H12.GENSCAN.18^ 

CH22.FGENES.28J 1.2 

336924 EOS36855 CH22_4699FG_347_9_ CH22_FGENES.347-9 1.2
327791 EOS27722 c.5Jisgl|5867977|reflgn 1 +22491 22610ex67CDSi 1U9120 658 

CH.05_hsgi|5867977 1.2

330717 EOS30648 AA233926 Hs^3635 ESTs 1.2 

322944 EOS22675 AA112573 EST cluster (not in UniGene) 1.2

312108 EOS12039 T82331 Hs.127453 ESTs 1.2 

332570 EOS32501 AA401376 Hs.26176 ESTs 1.2

330880 EOS30811 AA132420 Hs.53542 KIAA0986 protein 1.2

310341 EOS10272 AW302773 EST cluster (not in UniGene) 1.2

334012 EOS33943 CH22^1290F6J13J_UNK^MAC0a550aGENSCAN.169^ 

CH22.FGENES.313_3 1.2 

318230 EOS18161 AA558125 EST cluster (not in UniGene) 1.2

30 336071 EOS36002 CH22.3457FGJ85XUNieDJ32)iaGENSCAN.21-6 

CH22_FGENES.685_3 1.2 
338510 EOS38441 GH2^7208FG_UNlCEM:AC005500.GENSGAN.391-22 

CH2L0AACOO55OO.GENSCAN.391-22 1.2 
334487 EOS34418 CH2a,1786FGJ95J_UNK.EM:AC00550aGENSCAN.258-10 

CH22_FGENES.395_9 1.2

320661 EOS20592 AA884846 EST cluster (not in UniGene) 1.2

335200 EOS35131 CH22_2538FG 508_9_UNK_EM:AC00550aGENSCAN.40^9 

CH22LFGENES.508J 1.2 
333582 EOS33513 CH22_842FGJ01.2-UNK_EMAC0C5500.GENSCAN.72-3 

CH22_FGENES.201_2 1.2

320789 EOS20720 R78712 EST cluster (not in UniGene) 1.2

321185 EOS21116 H51659 Hs.189854 ESTs 1.2
337740 EOS37671 CH22_6085FG_UNK.EMAC000097.GENSCAN.100^ 

CH2^EM:AC000097.GENSCAN.100^ 1.2 

315064 EOS14995 AA775208 Hs.136423 ESTs 1.2
334883 EOS34814 CH22_2197FG_451_6_UNK_EM:AC005500.GENSCAN.34(W 

CH22_FGENES.451_6 1.2

331825 EOS31756 AA411144 Hs.104768 ESTs 1.2

319141 EOS19072 F12377 EST cluster (not in UniGene) 1.1

50 333682 EOS33613 CH22^944FGJ47^10„UNK_EMAC005500.GENSCAN. 102-10 

CH22_FGENES.247_10 1.1
336140 EOS36071 CH22_3530FG_705j_LlNieDA59H18.GENSCAN.10.2 

CH2ajGENEa705J 11 

320727 EOS20658 U96044 EST cluster (not in UniGene) 1.1

323947 EOS23878 AA649842 Hs.186667 ESTs 1.1

324746 EOS24677 AA603367 Hs.222294 ESTs 1.1

306744 EOS06675 AI031882 EST singleton (not in UniGene) with exon hit 1.1

328517 ^26448 c19Jiso[|5867439[refi9n U 44732 46356 ex 66 CDS1 148.221625 2512 

CH.19_hsgi|5867439 1.1
60 333597 EOS33528 CH2?.858FGJ211J_UNieEM:AC005500.GENSCAN.79-5 

CH22_FGENES.21L5 11 
. 330135 EOS30066 c21j)2 8t|44564701embign 2 -121583 121885 ex 22 CDSf 16.67 303102 

CH.21_p2gi|4456470 1.1

315118 EOS15049 AA564921 Hs.143899 ESTs 1.1

302893 EOS02824 AL117539 Hs.173515 Homo sapiens mRNA; cDNA DKFZp586H021 (from clone DKFZp586H021) 1.1

337169 EOS37100 CH22_5189FG_563_1_ CH22_FGENES.563-1 1.1
336121 EOS36052 CH22_3510FGJ01_6„UNK_DA59H18.GENSCAN.W 

CH22_FGENES.701_6 1.1

323332 EOS23263 AI829520 Hs.227513 ESTs 1.1

320911 EOS20842 AI056872 Hs.133388 ESTs 1.1
327990 EOS27921 c_6_hs gi|5868218|ref| gn 2 - 36225 36503 ex 1 2 CDSl 16.35 279 1419

CH.06_hsgi|5868218 1.1

320425 EOS20356 C14069 Hs.201627 ESTs; Moderately similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.1
327075 EOS27006 c21Jisgi|65319651ref|0n 58 + 404131840414316x440081 1791141285 

CH.21_hsgi|6531965 1.1

314384 EOS14315 AA535840 Hs.162203 ESTs; Weakly similar to alternatively spliced product using exon 13A [H.sapiens] 1.1
338716 EOS38647 CH2^7502FG_UNK_EM:AC005500.G£NSCAM488.9 

CH223&AC005500.GENSCAN.488.9 11 

330886 EOS30817 AA135608 Hs.189384 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.1
80 327331 EOS27262 cjjisgil5867516|refign 4 - 55606 55737 ex 2 6 CDS 7.01 132 2349 

CH.01_hsgi|5867516 1.1
326714 EOS26645 c20Jis gi|5867595|i8f|gn 2 +124490 124588 ex 5 6 CDSI 0.11791020 

CH.20_hsgi|5867595 1.1

316734 EOS16665 AW080237 Hs.252884 ESTs 1.1

311660 EOS11591 AI978583 Hs.232161 ESTs 1.1

312757 EOS12688 AI286970 Hs.183817 ESTs 1.1
331686 EOS31617

337840 EOS37771 

332093 EOS32024

319595 EOS19526

315990 EOS15921 

322438 EOS22369 

332965 EOS32896

337182 EOS37113

334948 EOS34879 



325864 EOS25795






337760 
315422 



EOS37691 
EOS15353 



332961 EOS32892 



314703 
317791 
333880 

322419 
338124 

308884 
333349 

313150 



EOS14634 
EOS17722 
EOS33611 

EOS22350 
EOS38055 

EOS08815 
EOS33260 

EOS13081 
EOS39139



335653 EOS35584 



319524 
301576 
317598 
333473 

333949 

339258 

332884 

314660 
333220 

308108 
320709 
307612 
330286 

304495 
310583 



337602 
307626 



EOS19455 
EOS01507 
EOS17529 
EOS33404 

EOS33880 

EOS39187 

EOS32815 

EOS14591
EOS33151 

EOS08037 
EOS20640 
EOS07543 
EOS30217 

EOS04426 
EOS10514 
EOS32827 

EOS37533 

EOS07557 
EOS34627 

EOS18583 
EOS37775 

EOS34754 

EOS33859 

EOS37434 
EOS22975 



318652 
337844 

334823 

333928 

337503 
323044 
329164 

335468 EOS35399 



323570 
333568 

331865 
336246 



EOS23501
EOS33499 

EOS31798 
EOS36177 



W88502 Hs.182258 ESTs 
CH22.6223FGLJJNieEMAC00550aGENSCAN.26-9 

CH2LEAitAC005S0aGENSCANJ2&-9 
AA608794 Hs.112592 ESTs
K81361 Hs.194485 ESTs
AI800041 Hs.190555 ESTs
W44531 Hs.167851 ESTs 
CH22_189FG,50J_UNK_EIwfeAC000097.<^NSCAN.W 

CH22_FGENEa50J 
CH2i.5204FG.570JL CH2aj6ENES.570-2 
CH2?J269FGJ65J5_UNK_EMAC00550aGENSCAN.359-13 

CH2^6ENES.465J5 
c18Jis^67069|rBf|gn2- 110834 110S04ex33CDSf a76 71 457 



337238 EOS37169 



CH22.6110FG_UNK.EA/tAC000097.GENSCAN.11&8 

CH22LEiykAC000097.GENSCAN.11&« 
AW135357 Hs.192374 ESTs 
CH22J746FG_UNKJW32I10.G£NSCAN.7.1 

CH22„DJ32110.G£NSCAN.7.1 
CH22.185FG^48J8.UNK_EM:ACOC0097.GENSCANJi-14 

CH22JGENES.48J8 
AI791249 EST cluster (not in UniGene)

A)8O1S0O Hs.128457 ESTs 
CH22^942FG_247.7.UNK.EMAC005500.GENSCAN.102.7 

CH2^fGENES.247.7 

AA248987 Hs.14084 ESTs; Highly similar to zinc RING finger protein SAG [M.musculus]
CH22.6661FG_UNK-EWtAC00550aGENSCAN.196-2 

CH22_EM:AC005500.GENSCAN.196-2 
AI833131 Hs.179100 ESTs
CH22_595FG_14DXUNK_EM:AC005500.GENSCANiO^ 

CH2^FGENEai40_3 
AA824410 Hs.165003 ESTs 
CH22UB146FG_L!NKJF113D11.GENSCAN.&3 

CH2LFF113D11.GENSCAN.6^ 
CH22_3013FG_590_4.UNK.EMAC005500.GENSCAN.4844 

CH2^FGENES590.4 
AA682865 Hs.194441 ESTs 

AI682905 Hs.146875 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens]

AW206035 Hs.192123 ESTs 

CH2^_724FG_162_3_UNK_EM:AC005500.GENSCAN.42.10 

CH2^FGENES.16^3 
CH22L1225FG_303_5 UNK.EMAC005500.GENSCAN.162-9 

CH2^FGENE8.303.5 
CH22L.8207F6_LINK_BA354I12.GENSCAN.7.11 

CH22_BA354IUGENSCAN.7-11 
CH22_104FG_33_5JJNK.C20H1ZGENSCAN.22-7 

CH2aj6ENES.33J 
AA436007 Hs.188780 ESTs 
CH2^457F6 104J2_UNieEM:AC000097.6ENSCAN.108-11 

CH22_FGENES.104J2 
AI476803 EST singleton (not in UniGene) with exon hit

AA456660 Hs.154165 ESTs 

AI290787 EST singleton (not in UniGene) with exon hit 

c^5_p2gii6671913Igb|Agn 2 - 3105031171 ex27CDSi a84 122791 

CR05_p2gi|6671913 
AA446446 EST singleton (not in UniGene) with exon hit

AW205632 Hs.211198 ESTs
CH22.117FGJ5_10.UNieC20H1ZGENSCAN.24-9 

CH22_FGENES.35_10 
CH22_5895FG__UNK_C20H1ZGENSCAN.15-1 

CH22_C20H1ZGENSCAN.15-1 
AI300035 EST singleton (not in UniGene) with exon hit

CH22_2006FG 421 5 UNK-EM-AC005500.GENSCAN282-5 

CH22.FGENES.42L5 
T53259 EST cluster (not in UniGene)

CH22_6229FG__UNieEM:AC005500.GENSCAN.30-9 

ai22_EM:ACC05500.GENSCAN,30-9 
CH22J137FG 437 5 UNieEMAC005500.6ENSCAN.301. 7 

CH22_FGENES.437.5 
CH22_1201FG 299 2_UNK-EMAC005500.GENSCAN.158-5 

CH22JGENES^.2 
CH2?_5738FG_803J CH22JGENES.803.1 
AA148725 Hs.154190 ESTs 

cjOis gii5868691|ref| gn 1 * 62305 62517 ex 2 2 COS1 17.51 213 1868 

CHXJi3gil5868691 
CH2^19FG 587j4 UN)eEMAC005500.GENSCAN.454-12 

CH2Z.FGENES.567_4 
CH22_7838FQJUNieDJ32iiaGENSCAN.23^9 

CH22_DJ32IiaGENSCAN.23-39 
AI038623 Hs.208752 ESTs; Weakly similar to !!!! ALU SUBFAMILY SX WARNING ENTRY !!!! [H.sapiens]
CH22.826FGJ85_1_UNK.EM:AC005500.GENSCAN.64-1 

CH2?J^GENES.185J 
AA425756 Hs.98445 ESTs 
CH22_3844FG_746_5_UNieOA59H18.GENSCAN.4M 

CH22LFG£NES.746_5 
CH2Z.5343F6.64U. CH2?_FGENES.641^ 



1.1 

1.1 
1.1 
1.1 
1.1 
1.1 

1.1 
1.1 

1.1 

1.1 

1.1 
1.1 

1.1 

1.1 
1.1 
1.1 

1.1 
1.1 

1.1 
1.1 

1.1 
1.1 

1.1 

1.1 
1.1 
1.1 
1.1 

1.1 

1.1 

1.1 

1.1
1.1 

1.1 
1.1 
1.1 
1.1 

1.1 
1.1 
1.1 

1.1 

1.1 
1.1 

1.1 
1.1 

1.1 

1.1 

1.1
1.1 
1.1 

1.1 

1.1 

1.1 
1.1 

1.1 
1.1 

1.1 
1.1 
305089 EOS05020 AA642822 EST singleton (not in UniGene) with exon hit 1.1

300097 EOS00a28 Ar916973 Ks^3603 ESTs 1-1 

313134 EOS13Q65 ^83406 ESTs 1-1 

337452 EOS37383 CH22_5665FG_775_1_ CH22_FGENES.775-1 1.1
325433 EOS25384 cDLhs 9!|586e936|f8qgn 4- 480706 480828 ex 3 4 COSi 1.99121818 

CK12jBffl5866938 1.1 
33SS99 EOS35930 CH22.3380FGL657J_UNK.ai246O7.GENSCAN.1M 

CH22_FGENES.657_1 1.1
333580 EOS33S11 CH22_840F6J99JLUNK.EM:AC005500.6ENSCAN.71-2 

CH22_FGENES.199Ji 1.1 

336836 EOS36767 CH22_4512FG_247_11_ CH22_FGENES.247-11 1.1
334677 EOS34608 CH22^1986FG_418_30_UNieEM:AC005500.GENSCAN.279^1 

CH22_FGENES.418_30 1.1
329062 EOS28993 cjLhsgiI58^590|ref]gn3-5897759094ex411 CDSi-ai9l18627 

CH.X_hsgi|5868590 1.1
333671 EOS33602 CH2^.932FG_245.5.UNK_EM:AC005500.GENSCAN.100-12

CH22_FGENES.245_5 1.1

304941 EOS04872 AA612612 EST singleton (not in UniGene) with exon hit 1.1

315772 EOS15703 AW515373 Hs.158893 ESTs 1.1 

301281 EOS01212 AA843986 Hs.190586 ESTs 1.1 
333520 EOS33451 CH22.777FG_174 3_UNK.EM:AC005500.GENSCAN.5a« 

CH22_FGENES.174_3 1.1

315203 EOS15134 AI559820 Hs.199438 ESTs 1.1

315927 EOS15858 AW025517 Hs.133250 ESTs 1.1

317161 EOS17092 AA972165 Hs.150308 ESTs 1.1 
337692 EOS37623 CH22_602BFG_UNK,B&AC000097.GENSCAN.78-12 

CH22_EM:AC000097.GENSCAN.78-12 1.1
331472 EOS31403 N24830 yx70a01.s1 Soares melanocyte 2NbHM Homo sapiens cDNA clone IMAGE:267050 3' similar to

gb|M87912|HUMALNE562 Human carcinoma cell-derived Alu RNA transcript (rRNA); contains Alu

repetitive element;, mRNA sequence. 1.1
336439 EOS36370 CH22_3859FG_827.4_UNieOJ579N16.GENSCAN.1-3 

CH22_FGENES.827_4 1.1
326882 EOS26813 c20Jis gij6682509Iret|gn 2 -167988 168179 ex 4 4 CDSf 18.691922238 

CH.20_hsgi|6682509 1.1

336977 EOS36908 CH22_4793FG_380_9_ CH22_FGENES.380-9 1.1
333983 EOS33914 CH22_1260F6_310_7.UNK..EMAC005500.GENSCAN.167-5 

CH22_FGENES.310_7 1.1
328878 EOS28809 c_7_hs gII6552423N gn 1 105580 105774 ex 6 7 COSI Z91 195 6195 

CH.07_hsgi|6552423 1.1

330415 EOS30346 D83777 Hs.75137 KIAA0193 gene product 1.1

324824 EOS24755 AI826999 Hs.224624 ESTs 1.1
325815 EOS25746 cUJis gt6682483|ref|gn 1-129273 130754 ex 1 1 COSo 11.82 1482 2225 

CH.14_hsgi|6682483 1.1

300463 EOS00394 NS2S10 Hs.186470 ESTs 1.1 
335708 EOS35639 CH2i.3069FGJ99J.UNK.EM:AC0055aaGENSCAN.490-11 

CH22_FGENES.599_8 1.1 

324575 EOS24506 AW502257 EST cluster (not in UniGene) 1.1

337951 EOS37882 CH22_6391FG_UNK_EM:AC005500.GENSCAN.94-1

CH22_EM:AC005500.GENSCAN.94-1 1.1
335935 EOS35866 CH22.3313FG_646XUNK.DJ24607.GENSCAN.1-5 

CH22_FGENES.646_6 1.1 
334914 EOS34845 CH2Z.2233FG_457_3J.INieEMAC005500.GENSCAN.346-2 

CH22_FGENES.457_3 1.1

309527 EOS09458 AW150648 Hs.75621 protease inhibitor 1 (anti-elastase); alpha-1-antitrypsin 1.1

318901 EOS18832 AW368520 Hs.24639 ESTs 1.1

320484 EOS20415 AA094436 Hs.155712 follistatin-like 1 1.1
333665 EOS33596 CH22_926FGJ44 1 UNK_EM:AC005500.GENSCAN.99-1 

CH22_FGENES.244_1 1.1
335860 EOS35791 CH22_3235FG 629J UNieEM:AC005500.GENSCAN.519-4 

CH22.FGENES.629J 1.1 

313339 EOS13270 AI682536 Hs.163495 ESTs 1.1

300149 EOS00080 AW448916 Hs.149018 ESTs 1.1

318112 EOS18043 AI028162 Hs.132307 ESTs 1.1
337807 EOS37738 CH22_6178FG_UNK^:AC005500.GENSCAN.94 

CH22.EMACC05500.GENSCAN.94 1.1 

336917 EOS36848 CH22_4688FG_346_4_ CH22_FGENES.346-4 1.1

337489 EOS37420 CH22_5722FG_799_2_ CH22_FGENES.799-2 1.1

320112 EOS20043 T92107 Hs.188489 ESTs 1.1 
332975 EOS32906 CH2i.199FG.51_10_UNieEJ*AC000097.GENSGAN.4-12 

CH22_FGENES.51_10 1.1
327805 EOS27736 c_5Js g]|5867968Hgn 2 +19952 20019 ex 1 2 CDSf 9.4768988 

Ca05_hsgii5867988 1.1 
339215 EOS39146 CH22_8153FQ_JJNKJT113O11.GENSCAN.6.10 

CH22_FF113O11.GENSCAN,6.10 1.1 

311965 EOS11896 T69279 EST cluster (not in UniGene) 1.1

314043 EOS13974 AA827082 EST cluster (not in UniGene) 1.1

333447 EOS33378 CH22-697FG 154 5 UNieEMAC005500.GENSCAN.3S6 

CH22_FGENES.154_5 1.1
333242 EOS33173 CH22_481FG_111 6_UNK_EM:AC000097.6ENSCAN.12(W 

CH22_FGENES.111_6 1.1
338596 EOS38527 CH25L7343FG_UNK-EMAC005500.GENSCAN.437-2 

CH2LEIM:AOO0550aGENSCAN.437.2 1.1 
329989 EOS29920 c16_p2g!|4567166|9b|Agn 2-^72861 73052 ex 1 3 COSf1&02 192590 

CH.16_p2gi|4567166 1.1

315675 EOS15606 AA652272 Hs.197320 ESTs 1.1

336722 EOS36653 CH22_4245FG_84_2_ CH22_FGENES.84-2 1.1
334220 EOS34151 a<22.1503FGJ59AUNK^EM:AaM550aGaJSO^

CH22_FGENES.359_4 1.1

336703 EOS36634 CH22_4201FG.56J. CH2?_R3ENE&5M 1.1 
- 336397 EOS36328 CH2^.3812FGJ23J^UNK.BA232E17.GENSCAM6-11 

CH22_FGENES.823_12 1.1

316105 EOS16036 AW2S5687 Hs^20 ESTs 1-1 
334661 EOS34S92 (>l22^1969FGj418j9.LINK.^IAC00550aG£NSCAN.279-13 

CH22_FGENES.418_9 1.1

307783 EOS07714 AI347274 EST singleton (not in UniGene) with exon hit 1.1

10 333997 EOS33928 CH2a.1275FG_310_2^.UNK.EM:AC005500.GENSCAN.1 67-21 

CH22_FGENES.310_22 1.1

331903 EOS31834 AA436673 Hs.29417 Homo sapiens mRNA; cDNA DKFZp586B0323 (from clone DKFZp586B0323) 1.1
328249 EOS28180 cXh$gi|6381891Ngfl2 - 9635296527ex23COSi ai91764550 

CH.08_hsgi|6381891 1.1
338251 EOS38182 CH22_6849FG_UNK_EM:AC005500.GENSCAN.270-1

CH22_EM:AC005500.GENSCAN.270-1 1.1

323561 EOS23492 AA825426 Hs.238832 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.1

301464 EOS01395 AA991519 Hs.253324 ESTs 1.1
335916 EOS35847 CH22„3293FG 636 12^UNieEMJ^C00550aGENSCAN.526-12
CH22_FGENES.636_12 1.1

321828 EOS21759 X56197 EST cluster (not in UniGene) 1.1

327413 EOS27344 cJUis giia67750|ref|gn 3 + 10141 0101508 ex 4 5 CDSi 4.3499587 

CH.0Osgi|5867750 1.1 
334474 EOS34405 CH22^1773FG_394.5.UNK.EMAC00550aGENSCAN.257-5 

CH22_FGENES.394_5 1.1

336739 EOS36670 CH22_4291FG_117_3_ CH22_FGENES.117-3 1.1

316517 EOS16448 AI784315 Hs.123163 ESTs 1.1
325519 EOS25450 c12_hs gi|6017036|ref| gn 5 - 186804 186915 ex 1 3 CDSl 8.36 112 2508

CH.12_hsgi|6017036 1.1
30 333875 £0333806 CH22_1145FG_291JLUNK_EM:AC005500.6ENSCAN.149^ 

CH2^FGENES.291J1 1.1 
338221 EOS38152 CH22^6797FG_UNICEMAC005500.GENSCAN.246-10 

C>122.EM:ACOOS500.GENSCAN.248-10 1.1 

336878 EOS36809 CH22_4617FG_318_5_ CH22_FGENES.318-5 1.1
35 337919 EOS37850 CH22L6338FG_UNK_EM:AC005500.GENSCAN.66-6 

CH22,EM:AC00550aGENSGAN.6&^ 1.1 

309828 EOS09759 AW293999 EST singleton (not in UniGene) with exon hit 1.1

305259 EOS05190 AA679225 EST singleton (not in UniGene) with exon hit 1.1 

333922 EOS33853 CH2Z.1195FG_296J3_UNieEM:AC005500.GENSCAN.155-l6 

40 CH22LFGENES.298J3 1.1 

322092 EOS22023 AF085833 EST cluster (not in UniGene) 1.1 

313356 EOS13287 AI266254 Hs.132929 ESTs 1.1

318847 EOS18778 Z42908 Hs.12308 ESTs 1.1

337175 EOS37106 CH22_5195FG_567_1_ CH22_FGENES.567-1 1.1

45 336979 EOS36910 CH22_4802FGJ85X CH22„FGENES.3864 1.1 

312169 EOS12100 AI064824 Hs.193385 ESTs 1.1
336198 EOS36129 CH22_3595FG_719JLUNK.DA59H18.GENSCAN.21-2 

CH22_FGENES.719_2 1.1

321948 EOS21879 AA309612 Hs.118797 ubiquitin-conjugating enzyme E2D 3 (homologous to yeast UBC4/5) 1.1

324692 EOS24623 AA557952 EST cluster (not in UniGene) 1.1

330395 EOS30326 D10923 Hs.137555 putative chemokine receptor; GTP-binding protein 1.1
333119 EOS33050 CH22L347FG_80_4_UNK-EIAAC000097.GENSCAN.65^ 

CH22_FGENES.80_4 1.1

316012 EOS15943 AA764950 Hs.119898 ESTs 1.1 

300142 EOS00073 AI743419 Hs.205707 ESTs 1.1

317215 EOS17146 AW014242 Hs.159998 ESTs 1.1 
329526 EOS29457 c10_p2 gi|3983506|gb|U gn 2 + 12251 12325 ex 3 3 CDSl 7.37 75 178

CH.10_p2gi|3983506 1.1

317409 EOS17340 AA764968 Hs.4864 KIAA0892 protein 1.1
339230 EOS39161 CH22_8171FG_UNK_BA354I12.GENSCAN.1-6

CH22_BA354I12.GENSCAN.1-6 1.1

311598 EOS11529 AW023595 Hs.232048 ESTs 1.1
339164 EOS39095 CH22.8091FG_UNiCDA59H18.GENSCAN.694

CH22_DA59H18.GENSCAN.694 1.1
326725 EOS26656 c20_hs gi|6552456|ref| gn 2 - 223005 223125 ex 5 6 CDSi 6.10 121 1038

CH.20_hsgi|6552456 1.1

330952 EOS30883 H02855 Hs.29567 ESTs 1.1 
334621 EOS34552 CH2^.192BFG 412j4_UNK.EMAC005500.GENSCAN.2754 

CH22_FGENES.412_4 1.1

301685 EOS01616 W67730 EST cluster (not in UniGene) with exon hit 1.1

308781 EOS08712 AI811707 EST singleton (not in UniGene) with exon hit 1.1

323413 EOS23344 AA248828 Hs.225676 ESTs 1.1

306723 EOS06654 AI026151 EST singleton (not in UniGene) with exon hit 1.1

331258 EOS31189 Z41777 Hs.27413 ESTs 1.1 

313028 EOS12959 AI355433 Hs.190856 ESTs 1.1
333002 EOS32933 CH22J26F6_59_3_UNK_EM:AC000097.GENSCAN.21-3 

CH22_FGENES.59_3 1.1

303011 EOS02942 AF090405 EST cluster (not in UniGene) with exon hit 1.1

317687 EOS17618 AA972990 Hs.127904 ESTs 1.1
80 328779 EOS28710 c.7Jsgi|58683Q9|r8f|gn4-*-4l57041639ex1 5CDSf Z65705365 

CH.07_hsgi|5868309 1.1
338707 EOS38638 CH22_7487FG_JUNK.EiybAO005S00.GENSCAN.482-2 

GH22_EMAC00S50aGENSCAN.462-2 11 
337974 EOS37905 CH22_6427FG_UNieEI*AC005500.(^SCAN.108^ 

85 CH22-.EMAC005500.GENSCAN.1064 11 

332854 EOS327e5 CH22L71F6J2.1.UN)eC20H1ZGENSCAN.1&.2 
CH22JGENES^1 1.1

311225 EOS11156 AW451982 Hs.248613 ESTs 1.1

337094 EOS37025 CH2^.6018FGje5-19. CH22_FGENES.465.19 1.1 

319357 EOS19288 F1342S K&26229 ESTs 1.1 
332958 EOS32ffl9 CH2?.182R5J8J5.UN]eEMACX)00097.GENSCAN.2-11 

a{22JGENES.48JS 1.1 

309834 EOS09565 AW193825 EST singleton (not in UniGene) with exon hit 1.1

321171 EOS21102 A1769410 HsJ221461 ESTs 1.1 

316440 EOS16371 AI954795 Hs.156135 ESTs 1.1

311665 EOS11598 AW^254 Hs.223742 ESTs 1.1 
327548 EOS27479 c.3Jlsgi|5867797M6n 2 -81067 81130 ex 3 7 COS 8.426412 

CH.03_hsgi|5867797 1.1

314940 EOS14871 AW452768 Hs.182045 ESTs 1.1
326401 EOS26332 cl9JtsgiI5887355|re}Ignl -^35165353326x911 COS! a41 168788 

CH.19_hsgi|5867355 1.1
336347 EOS36278 CH22L3759FG_815J_UNK^8A232E17.GENSCAN.1-24 

CH22_FGENES.815_3 1.1

322297 EOS22228 W76548 Hs.136028 ESTs; Moderately similar to !!!! ALU SUBFAMILY SC WARNING ENTRY !!!! [H.sapiens] 1.1

309977 EOS09908 AW451663 EST singleton (not in UniGene) with exon hit 1.1

333466 EOS33397 CH2a.717FGJ61J2.UNK-EMAC00S500.GENSCAN.42.2 

CH22JGENES.16U 1.1 
329170 EO829101 cjUiS8l|5668693|ref|gn2+6792468019ex68COSI a30981882 

CH.X_hsgi|5868693 1.1
329479 EOS^IO c10j)2gi|3983526Igb|Agn3-74257561 ex 1 3CDSI 4.3313722 

CH.10_p2gi|3983526 1.1
326666 EOS26599 c20_hsgil6552455|reflgn U 146726 146638 ex 11 11 CDSI 1.84 113767 

CH.20_hsgi|6552455 1.1

319354 EOS19295 H06538 Hs.12270 ESTs 1.1

302988 EOS02919 W23986 Hs.34578 alpha2;3-sialyltransferase 1.1
327687 EOS27618 c_4.hsgj|5867847|ref|gn 1 •169293189362ex23COSi^).2870782 

CH.04_hsgi|5867847 1.1
339413 EOS39344 CH22^8405FG„UNK-PJ579N16.GENSCAN.5^ 

CH2Z.CXI579N16.GENSCAN.&8 1.1 

306156 EOS06087 AA918274 Hs.76067 heat shock 27kD protein 1 1.1

320858 EOS20789 D59968 EST cluster (not in UniGene) 1.1

325447 EOS25378 c12Jis gP66941|ret| gn 3 > 372480 372621 ex 2 3 CDSI 9.16 142 1026 

CH.12_hsgi|5866941 1.1

322695 EOS22627 AI084724 Hs.228468 ESTs 1.1
329959 £0329890 c16_p2 giI5103B03Igb|A gn 3 1- 188050 188193 ex 8 8 COSI Z01 144 361

CH.16_p2gi|5103803 1.1

312628 EOS12559 AA632817 Hs.190316 ESTs 1.1
339305 EOS39236 CH22^8262FGJUNKJA354l1ZG£NSCAN.21-3 

CH2a.BA3S4l1ZGENSCAN.2t.3 1.1 

311829 EOS11760 m848S Hs.134549 ESTs 1.1 

303270 EOS03201 AL120518 Hs.105352 ESTs 1.1

321226 EOS21157 AA311443 Hs.251416 Homo sapiens mRNA; cDNA DKFZp586E2317 (from clone DKFZp586E2317) 1.1
335827 EOS35758 CH2Z.3200FG_620J_UNK^EM-AC005500.GENSCAN.512-1 

CH22.FGENES.620J 1.1 

336677 EOS36608 CH22_4155FG_43_5_ CH22_FGENES.43-5 1.1
330081 EOS30012 c19j>2gi|6015314|gb|Agn 1-5768 5835ex49COSi Z8868 162

CH.19_p2gi|6015314 1.1
339313 EOS39244 CH22_8272FG_UNK_BA354I12.GENSCAN.22-11

CH22_BA354I12.GENSCAN.22-11 1.1

319936 EOS19867 W22152 EST cluster (not in UniGene) 1.1

332858 EOS32789 CH2a.76FG_24_LUNICC20H12.GENSCAN.16-6 

CH2^FGENES.24J 1.1 

315630 EOS15S61 AA648355 Hs.185155 ESTs; Weakly siniilar to echinodem)m(Cfotubul8-associatedpn>tein^^^ 1.1 
332995 EOS32926 CH2W19FG.58JLUNieE\kAC000097.GENSCAN.19.2 

CH22_FGENES.58_2 1.1 
333441 EOS33372 CH2:L691FGJ5U.UNiePA:AC005500.GENSCAN.32^ 

CH22_FGENES.151_5 1.1
333496 EOS33427 CH2?_748FG 168 6_UNlCENtAC005500,GENSCAN.47^

CH22_FGENES.168_6 1.1
339188 EOS39119 CH22_8123FG_UNK_DA59H18.GENSCAN.72-16

CH22_DA59H18.GENSCAN.72-16 1.1

336981 EOS36912 CH22_4818FG_397_7_ CH22_FGENES.397-7 1.1

312142 EOS12073 AW»8359 Hs.221069 ESTs 1.1 

315779 EOS15710 AW016736 Hs.211378 ESTs 1.1 

318596 EOS18527 AI470235 Hs.172698 EST 1.1
335701 EOS35632 CH22.3062FG.599.1.UNlCEMACOa5500.GENSCAN,490.2

CH22_FGENES.599_1 1.1

319395 EOS19326 AW062570 Hs.13809 ESTs 1.1

304238 EOS04167 W93278 EST singleton (not in UniGene) with exon hit 1.1

307264 EOS07195 AI202211 EST singleton (not in UniGene) with exon hit 1.1 

334066 EOS33997 CH22_1344FG_327_2LUNieEMAC0055CO.GENSCAN.181.23 

CH22_FGENES.327_21 1.1
327042 EOS26973 c21_hs gi|6531965|ref| gn 18 - 1380806 1381443 ex 1 5 CDSl 30.85 638 943

CH.21_hsgi|6531965 1.1
326025 EOS25956 c17Jsgil58671761refign 1 *7085470915ex68COSI-1.4662127

CH.17_hsgi|5867176 1.1
325609 EOS25540 c14Jis gi|5886996|ref] gn 28 • 981751 981849 ex 1 10 COSI 1.46 99 101 

CH.14_hsgi|5866996 1.1

319983 EOS19914 T81429 EST cluster (not in UniGene) 1.1

334298 EOS34229 CH2?.l589FG_372.4.UNILEM:AC00550aGENSCAN.232-5

CH22_FGENES.372_4 1.1

323203 EOS23134 AA203135 Hs.130186 ESTs 1.1 
305700 EOS05631 AA815428 EST singleton (not in UniGene) with exon hit 1.1

313304 EOS13235 AI334078 Hs.152438 ESTs 1.1

310716 EOS10647 AI589618 Hs.192413 ESTs 1.1
^049 EOS2SS80 c21Jsg!|6531965|re{|gn 24- 1924026 1924110 ex 2 6 CDS 9.43651012 

CH.21_hsgi|6531965 1.1

313749 EOS13680 AW450376 Ks.1308a3 ESTs 1.1 

307041 EOS06972 AI144243 EST singleton (not in UniGene) with exon hit 1.1

322394 EOS22325 AF077208 EST cluster (not in UniGene) 1.1

326416 EOS28347 c19Jis9l5867382|reqgn3' 45283 453756x33CDSf &6593923 

10 Cai9Jisgil5887362 1.1 

333947 EOS33878 CH2^1221FG.303_LUNieEM:AC00550aGENSCAii162^ 

CH2^GENES^J 1.1 

324609 EOS24540 AW299534 EST cluster (not in UniGene) 1.1

330057 EOS29986 c174>2gi|6478962igb|Agn3+ 75145 75287ex33COSl-2.56143150 

15 CH.17j)2fli|6478982 1.1 

337603 EOS37534 OH2^5698F6_UNK_C20H1ZGENSCAN.16-2 

CH2^.C20H1ZGENSCAN.16.2 1.1 
332913 EOS328W CH22_134ro_36_18_UNlCC20H12.GENSCAN^17 

0122J^GENEa36_18 1.1 

310026 EOS09957 T24895 Hs.100891 ESTs 1.1
330153 EOS30084 c21_p2 gi|4325335|gb|A gn 2 + 146951 147475 ex 2 2 CDSl 25.45 525 233

CH.21_p2gi|4325335 1.1
334118 EOS34049 CH22J396F6_330J9^UNJeEM-i^C00550aGENSCAN.185-20 

CH22J^GENE&330J9 1.1 

324795 EOS24726 AI494481 Hs.141579 ESTs 1.1

332530 EOS32461 M31682 Hs.1735 inhibin; beta B (activin AB beta polypeptide) 1.1

332048 EOS31979 AA496019 Hs.201591 ESTs 1.1 
334532 EOS34463 CH2^1834F6J0a.13_UNfeEM:AC005500.G£NSCAN,266-13 

CH22_FGENES.402_13 1.1
329762 EOS29693 c14_p2 gi|6048280|emb| gn 3 + 127744 127878 ex 2 4 CDSi 11.66 135 1054

CH.14_p2gi|6048280 1.1
332909 EOS32840 CH2^130FG_36_13.UNieC20H1ZGENSCAN.28-10

CH22_FGENES.36_13 1.1

321253 EOS21184 AI699484 EST cluster (not in UniGene) 1.1

35 336572 EOS36503 CH22.4007FG_843J2^LINK^DJ579N16.GENSCAN.15-13 

CH22_FGENES.843_12 1.1
328768 EOS28699 c_7_hs gi|6017031|ref| gn 5 - 223741 224238 ex 1 1 CDSo 30.00 498 5285

CH.07_hsgi|6017031 1.1
334335 EOS34265 CH22_1627FGJ75J2LUNK.EMAC005500.GENSCAN.235.12

CH22_FGENES.375_12 1.1

334063 EOS33994 CH22_1341FGJ27J7 UNlCEfAAC005500.6ENSCAN.181-20 

CH2i^FGENES.327J7 1.1 
333011 EOS32942 CH22_235F6_61_3_UNK_EIWAC000097.GENSCAN.23^ 

CH22_FGENES.61_3 1.1

304677 EOS04608 AA548071 EST singleton (not in UniGene) with exon hit 1.1

313948 EOS13879 AW452823 Hs.1d5268 ESTs 1.1 
334358 EOS34289 CH2^.1652FG.378J.UNK.EM:AC005500.GENSCAN.239-1 

CH22LJGENES.378J 1.1 
328479 EOS28410 c.7Ji8gq5868449|refign 1-331 560 ex 1 31 CDSi 18.51 2302100 

CH.07_hsgi|5868449 1.1

335813 EOS35744 CH2?.3185F6_618J_UNieEIWAC005500.GENSCAN.510.1

CH22_FGENES.618_1 1.1

312430 EOS12361 AW139117 Hs.117494 ESTs 1.1

324783 EOS24714 AA640770 EST cluster (not in UniGene) 1.1

337776 EOS37707 CH22_6132FG_UNK_EM:AC000097.GENSCAN.119-18

CH22_EM:AC000097.GENSCAN.119-18 1.1
327205 EOS27135 cjjis gll58674471ref|gn 6 + 16733S 167576 ex 99 COSl 15.50 242 259 

CH.01_hsgi|5867447 1.1

315198 EOS15129 AI741508 Hs.186753 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.1
336135 EOS38086 CH2?_3S25FG_704XUNK_DA59H18.GENSCAN.9^

CH2aj=GENEa704J 1.1

316558 EOS16489 AW402677 Hs.90372 ESTs 1.1
328162 EOS28083 c.6Jsgi|5868080Iref1gn 1 -73981 74203 ex 1 8 CDSI 31.69 223 3411 

Ca08_hsgil5868080 1.1 
65 330211 EOS30142 CL5j)2gi|6013592|gb|Agnl +59156 59215 ex 2 4 CDS! 4.2058184 

CH.05_p2gi|6013592 1.1
339280 EOS39211 CH2^.8234FG_UNieBA3MI12.G£NSCAN.14.12 

CH22LBA354I12GENSCAN.14-12 1.1 

332045 EOS31976 AA491253 Hs.155045 bromodomain adjacent to zinc finger domain; 2A 1.1

313597 EOS13528 AW162263 Hs.249990 ESTs 1.1
329503 £0829434 c10j£gI|3983517|gb|Ugn 2- 1801 19376x14008! 4.33137101 

CH.10_p2gi|3983517 1.1 
333488 EOS33419 CH22L740FGJ67_3_UN1CEM:AC005500.GENSCAN.46.10 

CH22_FGENES.167_3 1.1

311960 EOS11891 AW440133 Hs.189690 ESTs 1.1

320590 EOS20521 U67058 Hs.168102 Human proteinase activated receptor-2 mRNA; 3'UTR 1.1
334047 EOS33978 CH22J 325FG_326^5_UNieEMJ\C005500.GENSCAN.175^

CH22_FGENES.326_5 1.1

304782 EOS04713 AA582081 EST singleton (not in UniGene) with exon hit 1.1

324231 EOS24162 W60827 EST cluster (not in UniGene) 1.1

327212 EOS27143 cjjisgl|5e67463|reflgn 1-42308 42424 ex 5130081 6.58117325 

CH.01_hsgi|5867463 1.1
335657 EOS3S788 CH22.3232FC.629J.UN)CEM:AC00550aGENSCAN.519-1

CH22_FGENES.629_1 1.1

317775 EOS17706 AA974603 Hs.181123 ESTs 1.1

331053 EOS30984 N70242 Hs.183146 ESTs 1.1 






WO 02/21996



PCT/US01/28716



335940 EOS35871 CH22^3318R3.646J3_UNK.OJ248D7.GBISCAN.1-t2 

CH22_FGENES.646_13 1.1

322568 EOS22499 W87342 Hs.203652 ESTs 1.1

314091 EOS14Q22 A1^12 Hs.133540 ESTs 1-1 

313570 EOS13501 AA041455 Hs.203312 ESTs 1.1

300987 EOS008S8 AA565209 Hs.19Q216 ESTs 1*1 

314544 EOS14475 AA399018 Hs.250835 ESTs 1.1
328321 EOS2B252 c.7Jis gl|586d373|reqgn 7 -1029614 1029673 ex 1 3 COSI -2.40 60 448 

Ca07Jsgip8M373 1.1 

10 310979 EOS10910 AW445166 Ks,17CaQ2 ESTs 1.1 

310730 EOS10661 AI939421 Hs.160900 ESTs 1.1 

318471 EOS18402 AW137725 Hs.146874 ESTs 1.1 

315533 EOS15464 AW206191 Hs.152774 ESTs 1.1
325751 EOS25682 cMJs gi|6682474Ir6fign 4 130437 1305») ex 6 7 CDS! 0.22841666 

CH.14_hsgi|6682474 1.1

318780 EOS18711 R90906 Hs.113307 ESTs 1.1

313271 EOS13202 AW444819 Hs.144851 ESTs; Weakly similar to C09F5.2 [C.elegans] 1.1

304546 EOS04477 AA486074 EST singleton (not in UniGene) with exon hit 1.1

330618 EOS30549 X55990 Hs.73839 ribonuclease; RNase A family; 3 (eosinophil cationic protein) 1.1

20 332931 EOS32862 CH2Z.152FG.38J.UNICC20H12.GENSCAN.2&5 

CH22_FGENES.38_5 1.1
336602 EOS38533 CH22.4047FG.37^4.UNieEM:AC0C550aCSENSCAN.232.4 

CH22JGENES.37^.4 1.1 

311185 EOS11116 AI638294 Hs.224665 ESTs 1.1

25 337585 EOS37516 CH22.5873FG_UNK.C20H1ZGENSCAN.5^ 

CH2i.C20H1ZGENSCAN.5.3 . 1.1 

310249 EOS10180 AW071751 Hs.13179 ESTs; Moderately similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.1

314578 EOS14509 AA410183 Hs.137475 ESTs 1.1

310750 EOS10681 AI373163 Hs.170333 ESTs 1.1

30 333968 EOS33899 CH22 1245FG_307^4 UNK_EM:AC005500.GENSCAN.165.5 

CH22_FGENES^_4 1.1 

316133 EOS16064 AI187742 Hs.125562 ESTs 1.1

308337 EOS08268 AI608947 EST singleton (not in UniGene) with exon hit 1.1

326160 EOS26091 c17Ji8gi|5867254{req gn 6- 112000 112137 ex2 4 CDS 8.01 1381952 

CH.17_hsgi|5867254 1.1

338023 EOS35954 CH2?.3406FG_669J2JLINieDJ32I10.GENSCAN.9-17 

CH22.FGENES.659J2 1.1 

323479 EOS23410 AA278246 EST cluster (not in UniGene) 1.1

336090 EOS36021 CH22_3477FG 689 2_UNieDJ32llO.GENSCAN.23.20 

CH22_FGENES.689_2 1.1

311192 EOS11123 AW237220 Hs.211130 ESTs 1.1
335081 EOS35012 CH22J409FG.488_4.UNK_EM:AC005500.GENSCAN.3846

CH22_FGENES.488_4 1.1

309519 EOS09450 AW148940 Hs.248647 EST 1.1 

321172 EOS21103 H49160 Hs.133472 ESTs 1.1

301976 EOS01907 T97905 EST cluster (not in UniGene) with exon hit 1.1

323012 EOS22943 AI832201 Hs.211469 ESTs 1.1 

319528 EOS19459 R08673 Hs.177514 ESTs 1.1 
329838 EOS29769 c144>2gi|6672082|embign 2 -^33990 34098 ex 3 4 CDS! 9.11 1092222 

50 CH.14_p2gi!6672062 1.1 

302623 EOS02554 AB019571 EST cluster (not in UniGene) with exon hit 1.1

334433 EOS34364 CH22.1731FGJ85J^UNK.EMAC005500.6ENSCAN.249^ 

CH22_FGENES.385_8 1.1 

304747 EOS04678 AA577816 EST singleton (not in UniGene) with exon hit 1.1 

55 333270 EOS33201 CH2^„513FGJ21_1_UNK3tAC005500.6ENSCAN.4.11 

CH22_FGENES.121_1 1.1
307054 EOS06985 AI148181 Hs.176835 EST 1.1

320764 EOS20695 R73070 Hs.246927 ESTs 1.1

321523 EOS21454 H78472 Hs.191325 ESTs; Weakly similar to cDNA EST yk414c9.3 comes from this gene [C.elegans] 1.1

60 322114 EOS22045 AA6W791 Hs.191740 ESTs 11 

303582 EOS03513 AA377444 EST cluster (not in UniGene) with exon hit 1.1 

322924 EOS22855 AA669253 Hs.193971 ESTs 1.1

311179 EOS11110 AI880843 Hs.223333 ESTs 1.1 

318601 EOS18532 T39921 EST cluster (not in UniGene) 1.1 

309791 EOS09722 AW276176 Hs.73742 ribosomal protein; large; P0 1.1
333882 EOS33813 CH22J153FG_292_4_UNK_EM:AC0D5500.GENSCAN.15(M

CH22_FGENES.292_4 1.1
337645 EOS37576 CH22_5960FG__UN1CEM:AC000097.GENSCAN.1(W 

CH22_EI\ftAC000097.GENSCAN.l(W 1.1 

70 335623 EOS36554 CH22_2983FG_584_2_UNieEM:AC005500.GENSCAN.478'2 

CH22_FGENES.584_2 1.1

314745 EOS14676 AA564489 Hs.137526 ESTs 1.1

330790 EOS30721 T48536 Hs.105807 ESTs 1.1

332071 EOS32002 AA598594 Hs.112475 ESTs 1.1

312005 EOS11936 T78450 Hs.13941 ESTs 1.1

330894 EOS30625 AA019806 Hs.108447 spinocerebellar ataxia 7 (olivopontocerebellar atrophy with retinal degeneration) 1.1

330739 EOS30870 AA293477 Hs.227591 ESTs 1.1

303042 EOS02973 AF129532 EST cluster (not in UniGene) with exon hit 1.1

323091 EOS23022 AW014094 Hs.210761 ESTs 1.1 

80 3288^ EO^SI cJJi5gi|5868330|reflgnU 90446 90602 ex 3 4 CDSi 10.20 157 5634 

CH.07_hsgi|5868330 1.1

300472 EOS00403 T90822 Hs.82609 hydroxymethylbilane synthase 1.1

310645 EOS10576 AI420742 Hs.163502 ESTs 1.1

332238 EOS32169 N53480 Hs.108622 ESTs 1.1

300966 EOS00897 AA564740 Hs.258401 ESTs 1.1

330437 EOS30368 HG2730-HT2827 Fibrinogen, A Alpha Polypeptide, Alt. Splice 2, E 1.1
302292 EOS02223 fmrw EST cluster (not in UniGene) with exon hit 1.1

330138 EOS30069 G21jv2gq42t0430Isntb|gn1 -22334 22460 6x33 CDSf1&56 127 105 

CH.21_p2gi|4210430 1.1

332952 E0832883 CH2^176F6.48^8JJNK.EIftACQ00097.GENSCAN^

CH22_FGENES.48_8 1.1

319901 EOS19832 T77138 Hs.8765 RNA helicase-related protein 1.1

321166 £(mm AA411263 Ks.128783 ESTs 1-1 

336227 EOS36158 CH22.3625FG 730JLUNK.DA59HiaGENSCAN.3S.2 

, _ CH22»FGENES.730_2 1.1 

302332 EOS02263 AI833168 Hs.184507 Homo sapiens Chromosome 16 BAC clone CIT987SK-M28A3 1.1

313800 EOS13731 AW298132 Hs.166674 ESTs 1.1 

33d356 EOS3d287 CH22.8326FG.UNK.3A354I1ZGENSCAN.31-1 

CH223A354112.GENSCAR3M 1.1 

324512 EOS24443 AW502125 EST cluster (not in UniGene) 1.1 

15 319235 EOS19166 F11330 Hs.177633 ESTs 1.1 

320352 EO820283 Y13323 KS.1452SS i&integrih protease 1.1 

338316 EOS38247 CH22_6944FG_UNieEM:AC005500.GENSCAN.304-2 

CH22_EMAC005500.GENSCAN.304-2 1,1 

333964 EOS33895 CH22.1 241 FG_305^NK_EM:AC00550aGENSGAN.1 64.2 

20 CH22-FGENE&305J 1.1 

312758 EOS12689 AA721107 Hs^2604 ESTs ' 1.1 

338178 EOS381G9 CH23L6726FG_UNK.EMtACXMJ5500.G£NSCAN.219^ 

CH22.EM:AC00550aGENSCAN.219^ 1.1 

315199 EOS15130 AA877998 Hs.125376 ESTs 1.1 

312321 EOS12252 R66210 Hs.186937 ESTs 1.1

338765 EOS38696 CH22^7588FG_UNK_EMAC005500.GENSCAN.518.1 

CH2a.EMAC005500.GENSCAa518-1 1.1 

330547 EOS30478 U32983 Hs.183S71 tryptophan 2;3-dioxygenase 1.1

315368 EOS15299 AW291563 Hs.152495 ESTs 1.1 
30 328691 EOS28622 c_7_hsgi|6588001|ref|gn 7 - 579598 5796646x23 008112.78 67 4326 

CaO7JisgiI6588001 1.1 

329179 EOS29110 cjLhsgi|5868704irei|gn 2'*' 181639 181815 ex 3 4 CDS! 0.32177 1939 

CaXJis 9^5868704 1.1 

327072 EOS27003 c21JisgiI653t965Ireqgn 55 - 3796429 3797197 ex 44 COSf a33 759 1270 

35 CH.21J15 ^16531965 1.1 

312056 EOS11987 T83748 Hs.189712 ESTs 1.1

339128 EOS39059 CH22 8046FG_UNK_DA59H18.GENSCAN.55-2 

CH22LDA59H18.GENSCAN.55.2 t.l 

307646 EOS07577 AI302236 EST singleton (not in UniGene) with exon hit 1.1

319198 EOS19129 F07354 EST cluster (not in UniGene) 1.1

338556 EOS38487 CH2?J283FG_UNICEI\AAC0a55C0.GENSCAN.417-8 

CH22.EM:AC005500.GENSCAN.417-8 t1 

308143 EOS08074 AA916314 EST singleton (not in UniGene) with exon hit 1.1

332384 EOS32315 M11433 Hs.101850 retinol-binding protein 1; cellular 1.1

325100 EOS25031 T10265 Hs.116122 ESTs; Weakly similar to coded for by C. elegans cDNA yk30b3.5 [C.elegans] 1.1

309839 EOS09770 AW296076 EST singleton (not in UniGene) with exon hit 1.1 

312180 E0S12111 AI248285 Hs.118348 ESTs 1.1 

330385 EOS30316 AA449749 Hs.31386 ESTs; Highly similar to secreted apoptosis related protein 1 [H.sapiens] 1.1

315882 EOS15813 AI831297 Hs.123310 ESTs 1.1 
50 325843 EOS25774 c16Jisgi|65524531ref|gn 1-7128 7232 ex 1 3003 1.87107182 

CH.16.hsgi|6552453 1.1 

330783 EOS30714 D60050 Hs.34812 ESTs 1.1 

317224 EOS17155 OS6760 Hs.8122 ESTs 1.1 

316042 EOS15973 AW297979 Hs.170898 ESTs 1.1 
55 333524 EOS33455 OH22.781FG_175_10.UNK_EMAC005500.GENSCAN.53-15 

CH22„F6ENES.175J0 1.1 

302357 EOS02288 X03178 Hs.198246 group-specific component (vitamin D binding protein) 1.1

309830 EOS09761 AW294725 EST singleton (not in UniGene) with exon hit 1.1

321489 EOS21420 AW392474 Hs.172759 ESTs; Moderately similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.1

312304 EOS12235 AA491949 Hs.1833^ ESTs 1.1

322026 EOS21957 AA233527 Hs.213289 low density lipoprotein receptor (familial hypercholesterolemia) 1.1
Table 1A shows the accession numbers for those primekeys in Table 1 which lack a
UniGene ID. Listed for each probeset is the gene cluster (CAT) number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.






Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers
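The three-column layout described in this legend can be sketched as a small record type. The following is only an illustrative sketch, not part of the disclosed method: it assumes each row is available as whitespace-separated text in the order Pkey, CAT number, accessions, and the names `ClusterRow`, `parse_row` and `index_by_accession`, as well as the sample rows, are hypothetical placeholders rather than values from Table 1A.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ClusterRow:
    """One Table 1A style row: probeset key, gene cluster (CAT) number, accessions."""
    pkey: int
    cat_number: str
    accessions: List[str] = field(default_factory=list)


def parse_row(line: str) -> ClusterRow:
    """Parse 'pkey cat_number acc1 acc2 ...' (assumed whitespace-separated)."""
    parts = line.split()
    if len(parts) < 2:
        raise ValueError(f"expected at least a Pkey and a CAT number: {line!r}")
    return ClusterRow(pkey=int(parts[0]), cat_number=parts[1], accessions=parts[2:])


def index_by_accession(rows: List[ClusterRow]) -> Dict[str, List[int]]:
    """Map each accession to the Pkeys whose cluster lists it."""
    index: Dict[str, List[int]] = {}
    for row in rows:
        for acc in row.accessions:
            index.setdefault(acc, []).append(row.pkey)
    return index


# Invented placeholder rows for illustration; these are NOT entries of Table 1A.
rows = [
    parse_row("100001 1234_1 ACC0001 ACC0002"),
    parse_row("100002 5678_1 ACC0002 ACC0003"),
]
index = index_by_accession(rows)
```

With such an index, a Genbank accession can be looked up to find which probesets were designed from a cluster containing it.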



Pkey CAT number Accession






300611 
301187 
301254 
301266 
301454 
301615 
301661 



301685 
301804 



301976 
302245 



302292 
302476 



302623 
302626



337193J 

434081J 

463589J 

468223J 

534162J 

5613_2 

7974J 



326972^1 
61J 



128835J 
9482.1 



27735.1 
3193Z.3 

9705J 
10441J 
AW93d568 AA251701 BE162320 AW938597

AA287406 AA261844 AA261845 AA287355 AA810895

AA278246 AW292815 AA278703

AW247696 BE265140 AW403615 AL037647 AA312338

AL041844 AL040002 AL039950

AA323414 AW664013 AI809377 AI276041 AW296883 AI798340
AK002161 AA327102 AI056888 AI743901 AI139018 AI199114 AI076003
AL042005 AL042006 AA911481
AA347566 AA346521 AI111169

AA378739 AW964174 AA570564 AI076833 AW265063 AW006805 AA480656 AW004789

W60827 AL079968 AL047234

AA464018 AA464079 AA468142

AA464510 AA631257 AI740516 AI739132 AW972467 AI741376 AW088935 AI467852 AI752240 AI123717 AI754551
AW205510 AW044211 AW028889 AW198033 AI538632 AA513096
AW500954 AW501111 AW501394
AW502122 AW502125 AW501663 AW501720
AW502257 AI014241 AA100360 BE298534

AW299534 AW299896 AA504765 AA505099 AA505100 AA584753 AW136415 AA768306

BE397649 H14413 BE397689 BE514098 H53372 AA448021 R57944 AI307272 BE259369 H72331 BE251092 T27364

AA001666 AA044433 AA875998 AW075405 AW338356 AA001667 AW300173 AW514944 AW468914 AA604673 AA702749

AA805550 AA447621 AA934104 AI373527 AA604794 AI911203 AI500844 AI291383 AA731133 BE350633 AA044604 H95689

H14366 AV660983 AA912893 AI369587 AI382271 AA917508 AW138391 BE622560

AI376331 AI819150 AI097038 AI351100 AW504689

AW503713 AA352950 AA044972 BE618246 AA335047 AW962269

AA557952 AA677593 AA618150

AI739168 AA426249 AI199636 AW505198 AW977^1 AA824583 AA883419 AA724079 AI015524 AI377728 AW293682

AI928140 AA731438 AI092404 AI085630 AA731340

T85872 T48305

AA640770 AI683112 AA913009

AA602539 D59262 AI684171 N46711 AW021857 D19768

AA613792 AW182329 T05304 AW858385

AK001379 AK001411 AW795711 T05997 AA287540 AA354538 AW957773 AI632268 AI651003 AI689650 AI809332

AW304483 AI805269 AA278506 AA862381 AA287875 AW628545 AI085761 AW025965 AI658615 AW628879 AW139496

AI214278 AA902745 AA991679 BE540102 AW593558 AI745602 AA744687 AI285441 AA807089 AI218314 AA721449

AI202987 AA432129 AI285502 AI281462 AA731319 BE082573

H09693 H09699 T09229

T19142 AI351168 T52843 BE241963
TABLE 1B

Table 1B shows the genomic positioning for those primekeys in Table 1 that lack UniGene
IDs and accession numbers. For each predicted exon, the genomic sequence source used for
prediction is listed. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.
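As a reading aid for the four columns defined above, the following minimal sketch parses one row into a record and derives the exon length. It rests on assumptions not stated in the table: that the fields are tab-separated, and that the two coordinates are inclusive. The names `PredictedExon` and `parse_exon_row` and the Pkey 999999 are hypothetical. The table lists the higher coordinate first for Minus strand rows, so the sketch stores the lower value as `start` regardless of strand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictedExon:
    pkey: int     # Eos probeset key
    ref: str      # sequence source (GI number or "Dunham, I. et al.")
    strand: str   # "Plus" or "Minus"
    start: int    # lower genomic coordinate
    end: int      # higher genomic coordinate

    @property
    def length(self) -> int:
        # Assumes both coordinates are inclusive.
        return self.end - self.start + 1


def parse_exon_row(line: str) -> PredictedExon:
    """Parse 'pkey<TAB>ref<TAB>strand<TAB>first-second' (assumed layout)."""
    pkey, ref, strand, span = (f.strip() for f in line.split("\t"))
    if strand not in ("Plus", "Minus"):
        raise ValueError(f"unexpected strand value: {strand!r}")
    first, second = (int(x) for x in span.split("-"))
    return PredictedExon(int(pkey), ref, strand, min(first, second), max(first, second))


# Hypothetical Pkey; the strand and coordinates follow the row format of the table.
row = parse_exon_row("999999\tDunham, I. et al.\tMinus\t1339607-1339397")
```

Normalizing the coordinate order in this way makes Plus and Minus rows directly comparable, for example when checking whether two predicted exons overlap.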



Pkey Ref

3327^ Dunham, I. etal.

332908 Dunham, I. etal.

332909 Dunham, I. etal.
332913 Dunham, I. etal.
332952 Dunham, I. etal.
332958 Dunham, I. etal.
332961 Dunham, I. etal.
332975 Dunham, I. etal.
332991 Dunham, I. etal.
333119 Dunham, I. etal.
333131 Dunham, I. etal.
333139 Dunham, I. etal.
333156 Dunham, I. etal.
333222 Dunham, I. etal.
333254 Dunham, I. etal.

333348 Dunham, I. etal.

333349 Dunham, I. etal. 
333366 Dunham, I. etal. 

333384 Dunham, I. etal. 

333385 Dunham, I. etal. 
333391 Dunham, I. etal. 
333488 Dunham, I. etal.
333520 Dunham, I. etal.
333524 Dunham, I. etal.
333532 Dunham, I. etal.
333580 Dunham, I. etal.
333585 Dunham, I. etal.
333597 Dunham, I. etal.
333619 Dunham, I. etal. 
333671 Dunham, I. etal. 
333680 Dunham, I. etal. 
333682 Dunham, I. etal. 

333763 Dunham, I. etal. 

333764 Dunham, I. etal. 

333769 Dunham, I. etal. 

333770 Dunham, I. etal. 
333849 Dunham, I. etal.
333875 Dunham, I. etal.
333882 Dunham, I. etal.
333922 Dunham, I. etal.
333928 Dunham, I. etal.
333947 Dunham, I. etal.
333949 Dunham, I. etal.
333968 Dunham, I. etal.
333983 Dunham, I. etal.
333995 Dunham, I. etal.
333997 Dunham, I. etal.
334003 Dunham, I. etal.
334012 Dunham, I. etal.
334047 Dunham, I. etal. 
334063 Dunham, I. etal. 
334066 Dunham, I. etal. 
334078 Dunham, I. etal. 
334118 Dunham, I. etal. 



Strand 


Nt_position


Plus 


73381-73768 


Plus 


1934283-1934366 


Plus 


1946582-1946735 


Plus 


1963539-1963843 


Plus 


2472864-2473012 


Plus 


2516164-2516310 


Plus 


2521424-2521555 


Plus 


2599641-2599702 


Plus 


2686938-2687372 


Plus 


3288316-3288640 


Plus 


3350064-3350170 


Plus 


3369495-3369571 


Plus 


3617584-3617790 


Plus 


3979706-3979803 


Plus 


2521424-2521555 


Plus 


4711908-4712181


Plus 


4713940-4714084 


Plus 


4798273-4798469 


Plus 


4907535-4907610 


Plus 


4907928-4908032 


Plus 


4916697-4916780 


Plus 


5396233-5396310 


Plus 


5586133-5586296 


Plus 


5612620-5612780 


Plus 


5622804-5622937 


Plus 


6142935-6143145 


Plus 


6234778-6234894 


Plus 


6331421-6331536 


Plus 


6562799-6562926 


Plus 


7038849-7039193 


Plus 


7071730-7071794 


Plus 


7076641-7076760 


Plus 


7692491-7692630 


Plus 


7693573-7693716 


Plus 


7696625-7696707 


Plus 


7700384-7700476 


Plus 


8018323-8018472 


Plus 


8135505-8136179 


Plus 


8153002-8153169 


Plus 


8381385-8381444 


Plus 


8468844-8469015 


Plus 


8579888-8579966 


Plus 


8589634-8589791 


Plus 


8681004-8681241 


Plus 


8813593-8813668 


Plus 


8855296-8855424 


Plus 


8866668-8867255 


Plus 


8892882-8892970 


Plus 


9007456-9010221 


Plus 


9428152-9428211 


Plus 


9731991-9732085 


Plus 


9739568-9739680 


Plus 


9809783-9809863 


Plus 


10344273-10344384 
334122 Dunham, I. etal.
334150 Dunham, I. etal.
334220 Dunham, I. etal.
334298 Dunham, I. etal.
334324 Dunham, I. etal.
334335 Dunham, I. etal.
334433 Dunham, I. etal.
334532 Dunham, I. etal.
334561 Dunham, I. etal.
334616 Dunham, I. etal.
334628 Dunham, I. etal.
334630 Dunham, I. etal.
334631 Dunham, I. etal.
334661 Dunham, I. etal.
334677 Dunham, I. etal.
334696 Dunham, I. etal.
334714 Dunham, I. etal.
334718 Dunham, I. etal.
334720 Dunham, I. etal.
334727 Dunham, I. etal.
334739 Dunham, I. etal.
334740 Dunham, I. etal.
334769 Dunham, I. etal.
334872 Dunham, I. etal.
334876 Dunham, I. etal.
334883 Dunham, I. etal.
334891 Dunham, I. etal.
334900 Dunham, I. etal.
334902 Dunham, I. etal.
334914 Dunham, I. etal.
334916 Dunham, I. etal.
335044 Dunham, I. etal.
335081 Dunham, I. etal.
335158 Dunham, I. etal.
335164 Dunham, I. etal.
335166 Dunham, I. etal.
335170 Dunham, I. etal.
335188 Dunham, I. etal.
335189 Dunham, I. etal.
335200 Dunham, I. etal.
335211 Dunham, I. etal.
335219 Dunham, I. etal.
335221 Dunham, I. etal.
335225 Dunham, I. etal.
335255 Dunham, I. etal.
335287 Dunham, I. etal.
335361 Dunham, I. etal.
335364 Dunham, I. etal.
335369 Dunham, I. etal.
335468 Dunham, I. etal.
335481 Dunham, I. etal.
335488 Dunham, I. etal.
335496 Dunham, I. etal.
335497 Dunham, I. etal.
335499 Dunham, I. etal.
335504 Dunham, I. etal.
335599 Dunham, I. etal.
335623 Dunham, I. etal.
335653 Dunham, I. etal.
335687 Dunham, I. etal.
335690 Dunham, I. etal.
335692 Dunham, I. etal.
335697 Dunham, I. etal.
335701 Dunham, I. etal.
335708 Dunham, I. etal.
335739 Dunham, I. etal.
335742 Dunham, I. etal.
336003 Dunham, I. etal.
336015 Dunham, I. etal.
336016 Dunham, I. etal.
336018 Dunham, I. etal.
336019 Dunham, I. etal.
336020 Dunham, I. etal.
336021 Dunham, I. etal.
336023 Dunham, I. etal.



Plus 


10411792-10411901 


Plus 


10529221-10529854 


Plus 


12718720-12718857 


Plus 


13424763-13425914 


Plus 


13539210-13539323 


Plus 


13608488-13608705 


Plus 


1427326M4273429 


Plus 


14792798-14792901 


Plus 


14987299-14987447 


Plus 


15176123-15176470 


Plus 


15310346-15310415 


Plus 


15322614-15322744 


Plus 


15325949-15326116 


Plus 


15477716-15477786 


Plus 


15517449-15517560 


Plus 


15665919-15666002 


Plus 


15760702-15760767 


Plus 


15775491-15775599 


Plus 


15792931-15793085 


Plus 


15942616-15942750 


Plus 


16004120-16004225 


Plus 


16009324-16009547 


Plus 


16170704-16170876 


Plus 


19162417-19162565 


Plus 


19185336-19185400 


Plus 


19223107-19223253 


Plus 


19299770-19299944 


Plus 


19315678-19315743 


Plus 


19317083-19317195 


Plus 


19495158-19495275 


Plus 


19572924-19573846 


Plus 


20842088-20842682 


Plus 


21113871-21113937 


Plus 


21569610-21569666 


Plus 


21585912-21586014 


Plus 


21587100-21587213 


Plus 


21623383-21623967 


Plus 


21669118-21669328 


Plus 


21673403-21673472 


Plus 


21743499-21743881 


Plus 


21774611-21774680 


Plus 


21875591-21875688 


Plus 


21882840-21882968 


Plus 


21890315-21890448 


Plus 


22032258-22032661 


Plus 


22299047-22299299 


Plus 


22807292-22807445 


Plus 


22833430-22833586 


Plus 


22843392-22843506 


Plus 


23787245-23787367 


Plus 


24082522-24084870 


Plus 


24118744-24118839 


Plus 


24164386-24164545 


Plus 


24167666-24167869 


Plus 


24176698-24176869 


Plus 


24182110-24182199 


Plus 


25043628-25043775 


Plus 


25138489-25138547 


Plus 


25329710-25329802 


Plus 


25445952-25446064 


Plus 


25455442-25455625 


Plus 


25468557-25468725 


Plus 


25481456-25481649 


Plus 


25513366-25513807 


Plus 


25541777-25541907 


Plus 


25698550-25698826 


Plus 


25712654-25712771 


Plus 


28406289-28406759 


Plus 


28640586-28640673 


Plus 


28646816-28646947 


Plus 


28660880-28660978 


Plus 


28663992-28664102 


Plus 


28683778-28683851 


Plus 


28686482-28686559 


Plus 


28698240-28698343 
336071 Dunham, I. etal. Plus
336090 Dunham, I. etal. Plus
336107 Dunham, I. etal. Plus
336121 Dunham, I. etal. Plus
336124 Dunham, I. etal. Plus
336132 Dunham, I. etal. Plus
336135 Dunham, I. etal. Plus
336194 Dunham, I. etal. Plus
336235 Dunham, I. etal. Plus
336367 Dunham, I. etal. Plus
336379 Dunham, I. etal. Plus
336439 Dunham, I. etal. Plus
336502 Dunham, I. etal. Plus
336572 Dunham, I. etal. Plus
336602 Dunham, I. etal. Plus
336721 Dunham, I. etal. Plus
336739 Dunham, I. etal. Plus
336766 Dunham, I. etal. Plus
336833 Dunham, I. etal. Plus
336836 Dunham, I. etal. Plus
336878 Dunham, I. etal. Plus
336880 Dunham, I. etal. Plus
336902 Dunham, I. etal. Plus
336917 Dunham, I. etal. Plus
336919 Dunham, I. etal. Plus
336924 Dunham, I. etal. Plus
336946 Dunham, I. etal. Plus
336953 Dunham, I. etal. Plus
336979 Dunham, I. etal. Plus
337169 Dunham, I. etal. Plus
337175 Dunham, I. etal. Plus
337182 Dunham, I. etal. Plus
337238 Dunham, I. etal. Plus
337303 Dunham, I. etal. Plus
337489 Dunham, I. etal. Plus
337503 Dunham, I. etal. Plus
337504 Dunham, I. etal. Plus
337570 Dunham, I. etal. Plus
337585 Dunham, I. etal. Plus
337629 Dunham, I. etal. Plus
337670 Dunham, I. etal. Plus
337674 Dunham, I. etal. Plus
337692 Dunham, I. etal. Plus
337740 Dunham, I. etal. Plus
337755 Dunham, I. etal. Plus
337807 Dunham, I. etal. Plus
337844 Dunham, I. etal. Plus
337902 Dunham, I. etal. Plus
337904 Dunham, I. etal. Plus
337919 Dunham, I. etal. Plus
337951 Dunham, I. etal. Plus
337958 Dunham, I. etal. Plus
337964 Dunham, I. etal. Plus
338008 Dunham, I. etal. Plus
338053 Dunham, I. etal. Plus
338057 Dunham, I. etal. Plus
338059 Dunham, I. etal. Plus
338120 Dunham, I. etal. Plus
338124 Dunham, I. etal. Plus
338178 Dunham, I. etal. Plus
338196 Dunham, I. etal. Plus
338204 Dunham, I. etal. Plus
338239 Dunham, I. etal. Plus
338249 Dunham, I. etal. Plus
338250 Dunham, I. etal. Plus
338251 Dunham, I. etal. Plus
338260 Dunham, I. etal. Plus
338282 Dunham, I. etal. Plus
338316 Dunham, I. etal. Plus
338364 Dunham, I. etal. Plus
338374 Dunham, I. etal. Plus
338454 Dunham, I. etal. Plus
338494 Dunham, I. etal. Plus
338596 Dunham, I. etal. Plus
338622 Dunham, I. etal. Plus



29264457-29264684 

29413020-29413162 

29987731-29987869 

30048054-30048129 

30053441-30053500 

30107247-30107412 

30123235-30123335 

30443138-30443282 

31122315-31122623 

33942937-33943058 

33995071-33995243 

34186130-34186215 

34268953-34269083 

34446383-34446496 

13424060-13424582 

3371522-3371586 

2599641-2599702 

4905608-4905684 

6856506-6856634 

7077262-7077326 

9200300-9200399 

9250034-9250123 

10385555-10386053 

11228329-11228403

11351181-11351274 

11525273-11525527 

12337073-12337258 

12988791-12988889 

14270748-14270816 

23529987-23530214 

23782209-23782374 

23934889-23934962 

27141465-27141776 

29128849-29128974 

33295724-33295872 

33385583-33385857 

33386053-33386236 

359309-359459 

951744-952008 

2017380-2017517 

3110593-3110760 

3332616-3332697 

3575105-3575299 

3870165-3870223 

3971764-3971900 

4444885-4444981 

4993372-4993603

5682218-5682307 

5685819-5686012 

6035207-6035326 

6766321-6766382 

6969162-6969270 

7032720-7032802 

7697068-7697236 

8412742-8412823 

8526397-8526522 

8540638-8540712 

10765673-10765820 

10860311-10860471 

12800037-12800181 

13629317-13629466 

13870980-13871152 

14669918-14670016 

14870864-14870944 

14874504-14874575 

14963460-14963521 

15458919-15459257 

16240812-16241002 

17089711-17089988 

18210049-18210226 

18371200-18371282 

20180035-20180113 

21181818-21182009 

23078273-23078348 

23546552-23546749 
338702 Dunham, I. etal.

338707 Dunham, I. etal.

338716 Dunham, I. etal. 

338765 Dunham, 1. etal. 

338852 Dunham, I. etal. 

338862 Dunham, 1. etal. 

338962 Dunham, I. etal. 

338997 Dunham, I. etal. 

339164 Dunham, I. etal. 

339305 Dunham, I. etal. 

339313 Dunham, I. etal.

339319 Dunham, I. etal.

339323 Dunham, I. etal. 

339356 Dunham, I. etal. 

339358 Dunham, I. etal.

339361 Dunham, I. etal.

339413 Dunham, I. etal. 

339418 Dunham, I. etal. 

339436 Dunham, I. etal. 

332813 Dunham, I. etal. 

332854 Dunham, I. etal. 

332858 Dunham, I. etal. 

332863 Dunham, I. etal. 

332868 Dunham, I. etal. 

332884 Dunham, I. etal. 

332886 Dunham, I, etal. 

332896 Dunham, I. etal. 

332929 Dunham, 1. etal. 

332930 Dunham, I. etal. 

332931 Dunham, I. etal. 

332932 Dunham, I. etal. 
332965 Dunham, I etal. 
332995 Dunham, I. etal. 
333002 Dunham, I. etal. 
333011 Dunham, 1. etal. 
333029 Dunham, I. etal. 
333033 Dunham, I. etal. 
333110 Dunham, L etal. 
333126 Dunham, I. etal. 
333217 Dunham, I. etal. 
333220 Dunham, I. etal. 

333242 Dunham, I. etal. 

333243 Dunham, I. etal. 
333259 Dunham, I. etal. 
333270 Dunham, I. etal. 

333278 Dunham, I. etal.

333279 Dunham, I. etal.
333358 Dunham, I. etal.
333408 Dunham, I. etal.
333441 Dunham, I. etal.
333444 Dunham, I. etal.
333447 Dunham, I. etal.
333466 Dunham, I. etal. 
333473 Dunham, I. etal. 
333496 Dunham, I. etal. 
333511 Dunham, I. etal. 
333542 Dunham, I. etal. 
333568 Dunham, I. etal. 
333582 Dunham, I. etal. 
333665 Dunham, I. etal. 
333673 Dunham, I. etal. 
333734 Dunham, I. etal. 
333964 Dunham, I. etal. 
334156 Dunham, I, etal. 
334172 Dunham, I. etal. 

334183 Dunham, I. etal. 

334184 Dunham, I. etal. 
334223 Dunham, I. etal. 
334270 Dunham, I. etal. 
334288 Dunham, 1. etal. 
334303 Dunham, I. etal. 
334358 Dunham, I. etal. 
334370 Dunham, I. etal. 
334472 Dunham, 1. etal. 
334474 Dunham, I. etal. 



Plus 25219632-25219739 

Plus 25266346-25266417 

Plus 25472519-25472686 

Plus 26657278-26657346 

Plus 28086911-28086971 

Plus 28230332-28230444 

Plus 29581892-29582020 

Plus 30092658-30092730 

Plus 32207441-32207802 

Plus 33334676-33334864 

Plus 33383457-33383585 

Plus 33410900-33410972 

Plus 33418653-33418829 

Plus 33573387-33573517 

Plus 33577760-33577922 

Plus 33580121-33580251 

Plus 34268734-34268875 

Plus 34353362-34353421 

Plus 34546469-34546834 

Minus 318840-318rn 

Minus 1283611-1283053 

Minus 1339607-1339397 

Minus 1389980-1389884 

Minus 1413234-1413078 

Minus 1573063-1572923 

Minus 1574863-1574660 

Minus 1631641-1631422 

Minus 2020758-2020664 

Minus 2022565-2022497 

Minus 2023651-2023562 

Minus 2035348-2035282 

Minus 2537457-2537396 

Minus 2708847-2708685

Minus 2537457-2537396 

Minus 2769669-2769571 

Minus 2885241-2885175 

Minus 2889900-2889699 

Minus 3244892-3244779 

Minus 3324305-3324184 

Minus 3967830-3967716 

Minus 3969363-3968789 

Minus 4104544-4104259 

Minus 4104961-4104728 

Minus 4306769-4306639 

Minus 4373573-4373219 

Minus 4414626-4414389 

Minus 4415252-4414844 

Minus 4732336-4732236 

Minus 4936879-4936661

Minus 2708847-2708685 

Minus 5070077-5069643

Minus 2537457-2537396 

Minus 2708847-2708685 

Minus 2537457-2537396 

Minus 5404643-5404523 

Minus 5557881-555ni8 

Minus 5861529-5861341 

Minus 5965072-5964999 

Minus 6158522-6158322 

Minus 6975471-6975215 

Minus 7054704-7054602

Minus 7535394-7535309

Minus 8626045-8625966 

Minus 10580883-10580765 

Minus 11644142-11644008 

Minus 11832582-11832508 

Minus 11833848-11833757 

Minus 12734365-12734269 

Minus 13249131-13249007 

Minus 13295104-13294969 

Minus 13454331-13454217 

Minus 13724372-13724201 

Minus 13782655-13782493 

Minus 14391308-14391169 

Minus 14391920-14391809 
334487 Dunham, I. etal.

334500 Dunham, I. etal. 

334537 Dunham, 1. etal. 

334621 Dunham, I. etal. 

334648 Dunham. I. etal 

334764 Dunham, 1. etal. 

334783 Dunham, I. etal. 

334784 Dunham, I. etal. 
334786 Dunham, I. etal, 
334789 Dunham, I. etal. 
334806 Dunham, I. etal. 
334823 Dunham, I. eUl. 
334850 Dunham, I. etal. 
334924 Dunham, I. etal. 
334939 Dunham, I. etal. 
334943 Dunham, I. etal. 
334948 Dunham, I. etal. 
334970 Dunham, I. etal. 
334991 Dunham,!, etal. 
334993 Dunham, I. etal. 
335062 Dunham, I. etal.
335207 Dunham, I. etal. 

335288 Dunham, I. etal. 

335289 Dunham, I. etal. 
335293 Dunham, I. etal. 
335332 Dunham. I. etal. 
335389 Dunham, I. etal. 
335478 Dunham, I. etal. 
335524 Dunham, I. etal. 
335547 Dunham, I. etal. 
335671 Dunham, 1. etal. 
335682 Dunham, I. etal. 
335684 Dunham, I. etal. 
335698 Dunham, I. etal. 

335755 Dunham, I. etal. 

335756 Dunham, I. etal. 
335773 Dunham, I. etal. 
335813 Dunham, L etal. 
335815 Dunham, I. etal. 
335817 Dunham, I. etal. 
335827 Dunham, I. etal. 
335829 Dunham, I. etal. 
335836 Dunham, L etal. 
335857 Dunham, I. etal. 
335860 Dunham, 1. etal. 
335871 Dunham, I. etal. 

335896 Dunham, I. etal. 

335897 Dunham, I. etal. 
335903 Dunham, L etal. 
335914 Dunham, I. etal. 
335916 Dunham, I. etal. 
335935 Dunham, I. etal. 
335940 Dunham, I. etal. 
335999 Dunham, I. etal. 
336045 Dunham,!, etal. 
336140 Dunham, L etal. 
336182 Dunham, I. etal. 
336198 Dunham, I. etal. 
336227 Dunham, 1. etal. 
336246 Dunham, I. etal. 
336262 Dunham, I. etal. 
336292 Dunham, I. etal. 
336347 Dunham, I. etal. 
336397 Dunham, I. etal. 
336434 Dunham, I. etal. 
336662 Dunham, I. etal. 

336675 Dunham, I. etal. 

336676 Dunham, I. etal. 

336677 Dunham, I. etal. 

336678 Dunham, I. etal. 
336684 Dunham, I. etal. 
336703 Dunham, I. etal. 
336707 Dunham, I. etal. 
336722 Dunham. 1. etal. 
336778 Dunham, I. etal. 



Minus 14432191-14432132 

Minus 14486730-14486621

Minus 14827542-14827354 

Minus 15190418-15190299 

Minus 15363301-15363222 

Minus 16151208-16151104 

Minus 16293336-16293226 

Minus 16294548-16294360 

Minus 16297434-16297275 

Minus 16306095-16305996 

Minus 16433227-16433125 

Minus 16851360-16851189 

Minus 17660892-17660787 

Minus 19744615-19744229 

Minus 20131162-20131054 

Minus 20135064-20134903 

Minus 20141727-20141583 

Minus 20195886-20195554 

Minus 20341858-20341773 

Minus 20354277-20354174 

Minus 20921289-20921087 

Minus 21763011-21762880

Minus 22304275-22303770 

Minus 22305950-22305708 

Minus 22316408-22316275 

Minus 22557778-22557557 

Minus 23043682-23043558 

Minus 23924778-23924329 

Minus 24237218-24236208

Minus 24658526-24658460 

Minus 25358629-25358533 

Minus 25421215-25421093 

Minus 25425165-25425096

Minus 25493029-25492767 

Minus 25763806-25763747 

Minus 25764330-25764251

Minus 25880858-25880661

Minus 26318734-26318649

Minus 26320518-26320421

Minus 26321875-26321750

Minus 26380557-26380472 

Minus 26382348-26382251 

Minus 26397823-26397694 

Minus 26677208-26677096 

Minus 26684908-26684800 

Minus 26734972-26734892 

Minus 26977639-26977558 

Minus 26978293-26978142 

Minus 26985739-26985580 

Minus 27024197-27023994 

Minus 27027028-27026912

Minus 27360288-27360058 

Minus 27420194-27420000 

Minus 28033986-28033848 

Minus 29044217-29044140 

Minus 30134204-30133980 

Minus 30371411-30371339 

Minus 30459668-30459460 

Minus 30902014-30901946 

Minus 31425669-31425253 

Minus 31833610-31833533 

Minus 32818035-32817927 

Minus 33843218-33843104 

Minus 34021504-34021389

Minus 34073056-34072952 

Minus 2158060-2157993 

Minus 2020758-2020664 

Minus 2022565-2022497 

Minus 2023651-2023562 

Minus 2035348-2035282 

Minus 2158060-2157993 

Minus 5071373-5071278 

Minus 2820219-2820111 

Minus 3377722-3377590 

Minus 5071373-5071278
336846 Dunham, I. etal.

336977 Dunham, I. etal. 

336981 Dunham, I. etal. 

337056 Dunham, I. etal. 

337094 Dunham, I. etal. 

337102 Dunham, I. etal. 

337161 Dunham, I. etal. 

337366 Dunham, I. etal. 

337422 Dunham. I. etal. 

337452 Dunham, I. etal. 

337534 Dunham, I. etal. 

337595 Dunham, I. etal. 

337602 Dunham, I. etal. 

337603 Dunham, I. etal. 
337645 Dunham, I. etal. 
337760 Dunham, I. etal. 
337772 Dunham, I. etal. 
337776 Dunham, I. etal. 
337840 Dunham, 1. etal. 
337908 Dunham, 1. etal. 
337937 Dunham, I. etal. 
337974 Dunham, I. etal. 
337986 Dunham, I. etal. 
338000 Dunham, I. etal. 
338221 Dunham, I. etal. 
338510 Dunham, L etal. 
338535 Dunham, I. etal. 
338546 Dunham, I. etal. 
338556 Dunham, I. etal. 
338561 Dunham, I. etal. 
338668 Dunham, I. etal. 
338689 Dunham, I. etal. 
338727 Dunham, I. etal. 
338759 Dunham, I. etal. 
338763 Dunham, I. etal. 
338779 Dunham, I. etal. 
338876 Dunham, I. etal. 
338889 Dunham, I. etal. 
339028 Dunham, I. etal. 
339044 Dunham, I. etal. 
339128 Dunham. I. etal. 
339188 Dunham, I. etal. 
339208 Dunham, I. etal. 
339215 Dunham, I. etal. 
339230 Dunham, I. etal.
339256 Dunham, I. etal. 
339280 Dunham, I. etal. 
339370 Dunham, I. etal. 
337895 

329598 3962482 

329563 3962490 

329557 3962492 

329539 3983503 

329526 3983506 

329524 3983507 

329502 3983517 

329503 3983517 
329499 3983518 
329479 3983526 
329625 4567169 
325363 5866920 
325366 5866920 
325433 5866936 
325447 5866941 

325481 5866957 

325482 5866957 
325472 6017034 
325513 6017035 
325519 6017036 
325587 6682462 
325585 6682462 
325594 5866992 
325609 5866996 
325622 5867000 
325751 6682474 



Mttiii<; 


7566306-7566238 




141 1 0003«1 4109910 


A4t111IQ 


1 447R63 8*1 4478472 


Minus 


17975104-17974976 


Minus 


201 4691 S-201 46778 


Minus 


2058 1 73 8-2058 1 628 


Minus 


23473450-23473375 


Minus 


30961904^30961787 


Minus 


32030671-32030417 


Mtnufi 
tnuuMa 


32415187-32415117 


Minus 


341 93847-34193769 


muiusi . 


1 020^06-1 0202 10 


iTiuiua 


12S9Q$t7-1 282741 


l^tflllC 

[viiuua 


1 20Q9Q6-1 2901 94 

I A3r7X70- 1 a77 1 7*t 


iviinua 




iVlUlUd 




IVXlllUd 


40^1918-4061782 


IVllllUO 


40R4S55-4084460 


MtniiQ 


4940540-4Q4040Q 




5697187-5697071 

^\J^ 9 A. U r 9 V 9 E 




6556005-6555907 




71S1401-715308S 


XViUIUd 


72Q6008-7295951 




7530875-7530793 


iviuiUd 


t41${3f)49-141 83568 

l*TiO JU^7— l*Tl 0.^.}U0 


xVXIIiUd 


21 339584-21339508 


iviuiuo 


1 1 1 790174 

X> I / 77U7U-A 1 / yy£t 1 *t 


\^tniic 
IViinUtk 


2201 744R-2201 2383 


\A fn lie 


221 70326-22 1 70234 




2231 1066-2231 1856 




24500606-24500442 


K^ITIIIC 

iViUlUa 


24803073-24802072 


XV11J1U3 


25026788-2502 6580 




26582475-265821 00 


iVlUlUd 


26628148-26628000 


^finus 


27030 1 5 1 -27020705 


IVilllUS 


28364326-2836407 1 


A4iniic 

iViJJJU4> 


28477552-2847741 2 

XOt/ /.JJX— XrOt/ /*TlXr 


ivLinusi 


305741 22-30573037 


iviinus 


3072 1 853-30721 740 


IVIUIUS 


3 1 60281 5-3 1 602686 




32347554-32347250 


Minus 


32491714-32491657 


Minus 


32502559-32502383 


IVIIIIUS 


32720004-32728020 


Nfinus 


32026055-32025067 


KiftniiQ 

iVlIIlUS 


33114230-33114010 




3380501 2-33805797 


Plus 


39024^0220 


IViUlUSl 


410.^35 


IVllllUo 


53197-53647 


Kiftntie 


1-326 




12251-12325 


iviinus 


38025-3R143 


Plus 


75-338 


iVilllUo 


1 801-1037 

I Ovll 1 7iJ / 


Plus 


33463-33780 


IVlllllio 


7425-7561 

/*Ti6j— /./Vl 


IVllIlUo 


85803-85084 


Plus 


700446-700516 




020062-021713 

7-6v7V*. 7^ L f l-J 


Mimic 


480706-480826 


I^inus 


372480-372621 


Plus 


47500-47672 


Plus 


47057-48078 


Nlinus 


280581-280657 


I^inus 


34205-34400 

J*Tifc7 J Jt*f 7\/ 


IVlUlUd 


186804-186015 


Plus 


126724-126967 


Plus 


73476-73574 


Minus 


470474-470566 


Minus 


981751-981849 


Plus 


69994-70075 


Plus 


130437-130520 
325815 6682483

329762 6048280 

329789 6469354 

329797 6523160 

5 329838 6672062 

325864 5867069 

325885 5867087 

325892 5867088 

325929 5867125 

10 325843 6552453 

329989 4567166 

329960 5091594 

329959 5103803 

329936 6165200 

329919 6223624

330004 6623963 

326025 5867176 

326054 5867184 

326112 5867192 

20 326165 5867208 

326213 5867224 

326219 5867226

326160 5867254 

326257 5867264 

25 330057 6478962 

326359 5867293 

326393 5867341 

326399 5867353 

326401 5867355 

30 326416 5867362 

326431 5867371 

326460 5867400 

326517 5867439 

326519 5867439 

35 326596 6138928 

330081 6015314 

326714 5867595 

326752 5867615 

326757 6249610 

40 326668 6552455 

326720 6552456 

326725 6552456 

326862 6552465 

326882 6682509 

45 326892 6682511 

326996 5867660 

327010 5867664

326919 6456782 

327042 6531965 

50 327049 6531965 

327072 6531965 

327074 6531965 

327075 6531965 
326981 6588016 

55 327133 6682522 

330137 4210430 

330138 4210430 
330143 4210430 
330153 4325335 

60 330135 4456470 

327205 5867447 

327212 5867463 

327287 5867479 

327331 5867516

65 327364 6552412 

327413 5867750 

327481 5867783 

327458 6004455 

327516 6117815 

70 327527 6381882 

327548 5867797 

327554 5867801 

327565 5867811 

327600 6004462 

75 327687 5867847 



Minus 


129273-130754 


Plus


127744-127878 


Minus 


118977-119036 


Minus 


10616-10894 


Plus 


33990-34098 


Minus 


110834-110904


Plus 


193212-193377 


Minus 


10498-10652 


Minus 


51715-51996 


Minus 


7126-7232 


Plus 


72861-73052 


Minus 


1031-1162 


Plus 


188050-188193 


Minus 


82761-82920 


Minus 


103492-103681 


Minus 


78872-78999 


Plus 


70854-70915 


Minus 


146342-146469 


Plus 


2151-2725 


Minus 


62787-62929 




60751-60927 




264008-264274 


Minus 


112000-112137 


Plus 


???717-??^819 


Plus 


75145-75287 


Plus 


9436-9494 


Plus 


41702-41841 


Plus 


6385-6536 


Plus 


35165-35332 


Minus 


45283-45375 


Plus 


15855-15971 


Minus 


142633-142935 


Plus 


44732-46356 


Plus 


166004-166243 


Plus 


133386-133563 


Minus 


5768-5835 


Plus 


124490-124568 


Minus 


1214-1562 


Plus 


74531-74597 


Plus 


146726-146838 


Plus 


84525-84677 


Minus 


223005-223125 


Plus 


107702-107782 


Minus 


167988-168179 


Plus 


119424-119500 


Minus 


63212-63404 


Plus 


941057-941139 


Minus 


40486-41046 


Minus 


1380806-1381443 


Minus 


1924026-1924110 




3796429^3797197 


Plus 


4039993-4040096 


Plus 


4041318-4041431 


Plus 


105091-106038 


Plus 


38069-38938 


Minus 


21220-21377 


Minus 


22334-22460 


Plus 


184737-184848 


Plus 


146951-147475 


Minus 


121583-121885 


Plus 


167335-167576 


Minus 


42308-42424 


Mtnii<« 


62838-63024 


Minus 


55606-55737 


Minus 


115235-115396 


Plus 


101410-101508 


Plus 


104472-104673 


Plus 


173257-173378 


Plus 


199078-199216 


Minus 


98950-99040 


Minus 


81067-81130 


Minus 


23092-23191 


Plus 


32516-32778 


Minus 


2621-2862 


Minus 


169293-169362 
330182 5123954 Plus 120156-120245
327742 5867944 Minus 143307-143512
327805 5867968 Plus 19952-20019
327809 5867968 Plus 54610-54761
327814 5867968 Plus 69377-70566
327815 5867968 Plus 70804-71401
327791 5867977 Plus 22491-22610
327745 6531959 Minus 229066-229124
330211 6013592 Plus 59158-59215
330207 6013606 Minus 109912-110004
330257 6671881 Minus 143228-143393
330262 6671884 Plus 67913-68053
330286 6671913 31050-31171
328105 5868020 Minus 301705-301784
328113 5868024 Minus 80378-80491
328142 5868050 9656-9778
328152 5868060 73981-74203
328170 5868071 Plus 93170-93295
327910 5868162 Plus 21622-21748
327919 5868165 Plus 547701-547800
327990 5868218 Minus 36225-36503
328249 6381891 Minus 96352-96527
328251 6381891 Plus 124444-124557
328253 6381894 Minus 4411-4509
328084 6469819 155366-155459
328274 5868219 31244-31439
328615 5868239 Plus 35214-35347
328632 5868247 Plus 76734-76853
328779 5868309 Plus 41570-41639
328783 5868309 Minus 73fi58-73R22
328801 5868321 Minus 44492-44609
328820 5868330 Plus 90446-90602
328835 5868339 Plus 88053-88461
328290 5868363 127366-127496
328321 5868373 Minus 1029614-1029673
328332 5868375 Plus 280154-280289
328333 5868375 Plus 282506-282664
328349 5868383 Minus 260704-260804
328450 5868425 Minus 209192-209321
328466 5868434 Minus 15fi43-159fl0
328479 5868449 Minus 331-560
328481 5868449 Minus 8987-9180
328546 5868487 Minus 17547-17722
328662 6004473 Plus 1184773-1184855
328767 6017031 Minus 35625-35723
328768 6017031 Minus 223741-224238
328857 6381927 80^57-81051
328878 6552423 Plus 105580-105774
328882 6552423 Minus 157669-157826
328690 6588001 Minus 571207-571274
328691 6588001 Minus 579598-579664
330307 4877982 Plus 107384-107559
328903 5868514 Plus 23625-24468
328987 5868535 Minus 25705-25764
328998 5868538 Plus 40996-41104
329062 5868590 Minus 58977-59094
329086 smm Minus 35489-35588
329154 5868686 Minus 200851-201356
329156 5868686 Minus 202013-202341
329164 5868691 Plus 62305-62517
329170 5868693 Plus 67924-68019
329179 5868704 Plus 181639-181815
329193 5868716 Plus 168095-168181
329254 5868733 Plus 4133-4214
329369 5868842 Minus 121148-121516
329367 5868842 Minus 87201-87587
329141 6017060 Plus 343924-343997
329347 6456785 Plus 18433-18897
329017 6682532 Minus 255591-255672
329434 5868883 Minus 31124-31263
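
The strand and nucleotide-range entries of Table 1 can be read as locating each segment on its reference sequence. The following Python sketch shows one way such an entry might be resolved. It assumes, without confirmation from the table itself, that ranges are 1-based and inclusive and are given on the plus strand, with a Minus entry reverse-complemented after slicing. The function `extract_segment` and the toy reference string are hypothetical and introduced only for illustration.

```python
# Hypothetical helper: resolve a Table 1 style (strand, range) entry
# against a reference sequence held in memory as a plain string.
# Assumption: ranges are 1-based, inclusive, and refer to the plus strand.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract_segment(reference: str, strand: str, nt_range: str) -> str:
    """Return the segment named by `nt_range`, oriented according to `strand`."""
    start_text, end_text = nt_range.split("-")
    start, end = int(start_text), int(end_text)
    if not 1 <= start <= end <= len(reference):
        raise ValueError(f"range {nt_range} lies outside the reference")
    segment = reference[start - 1:end]  # convert to 0-based, end-exclusive
    if strand == "Plus":
        return segment
    if strand == "Minus":
        # Reverse complement: complement each base, then reverse the order.
        return segment.translate(COMPLEMENT)[::-1]
    raise ValueError(f"unrecognized strand label: {strand!r}")


if __name__ == "__main__":
    toy_reference = "AACCGGTTAC"  # illustrative only, not a real accession
    print(extract_segment(toy_reference, "Plus", "3-6"))
    print(extract_segment(toy_reference, "Minus", "1-4"))
```

Rejecting unrecognized strand labels, rather than guessing, keeps illegible entries from being silently treated as one strand or the other.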
TABLE 2 DNA AND PROTEIN SEQUENCES FOR CBF9 AND BF08

Table 2 provides the nucleic acid and protein sequences of the CBF9 and BF08 genes, as well
as the Unigene and Exemplar accession numbers for CBF9 and BF08.



CBF9 DNA SEQUENCE 



Gene name: ESTs
Unigene number: Hs. 157601
Probeset Accession #: W07459
Nucleic Acid Accession #: AC005383
Coding Sequence: 328-2751 (underlined sequences correspond to start and stop codons)



1          11         21         31         41         51
|          |          |          |          |          |
GACAGTGTTC GCGGCTGCAC CGCTCX3GAGG CTGGGTGACC CGCGTAGAAG TGAAGTACTT 60
TTTTATTTGC AGACCTGGGC 0GATGCCX3CT TTAAAAAACG CGAGGGGCTC TATGCACCTC 120
CCTGGCGGTA GTTCCTCCGA CCTCAGCCGG GTCGGGTCGT GCCGCCCTCT CCCAGGAGAG 180
ACAAACAGGT GTCCCACX3TG GCAGCCGCGC CCCGGGCGCC CCTCCTGTGA TCCCGTAGCG 240
CCCCCTGGCC CGAGCCGCGC CCGGGTCTGT GAGTAGAGCC GCCCGGGCAC CGAGCGCTGG 300
TCGCCGCTCT CCTTCCGTTA TATCAACATG CCCCCTTTCC TGTTGCTGGA GGCCGTCTGT 360
GTTTTCCTGT TTTCCAGAGT GCCCCCATCT CTCCCTCTCC AGGAAGTCCA TGTAAGCAAA 420
GAAACCATCG GGAAGATTTC AGCTGCCAGC AAAATGATGT GGTGCTCGGC TGCAGTGGAC 480
ATCATGTTTC TGTTAGATGG GTCTAACAGC GTCGGGAAAG GGAGCTTTGA AAGGTCCAAG 540
CACTTTGCCA TCACAGTCTG TGACGGTCTG GACATCAGCC CCGAGAGGGT CAGAGTGGGA 600
GCATTCCAGT TCAGTTCCAC TCCTCATCTG GZ^TTCCCCT TGGATTCATT TTCAACCCAA 660
CAGGAAGTGA AGGCAAGAAT CAAGAGGATG GTTTTCAAAG GAGGGCGCAC GGAGACGGAA 720
CTTGCTCTGA AATACCTTCT GCACAGAGGG TTGCCTGGAG GCAGAAATGC TTCTGTGCCC 780
CAGATCCTCA TCATCGTCAC TGATGGGAAG TCCCAGGGGG ATGTGGCACT GCCATCCAAG 840
CAGCTGAAGG AAAGGGGTGT CACTGTGTTT GCTGTGGGGG TCAGGTTTCC CAGGTGGGAG 900
GAGCTGCATG CACTGGCCAG CGAGCCTAGA GGGCAGCACG TGCTGTTGGC TGAGCAGGTG 960
GAGGATGCCA CCAACGGCCT CTTCAGCACC CTCAGCAGCT CGGCCATCTG CTCCAGCGCC 1020
ACGCCAGACT GCAGGGTCGA GGCTCACCCC TGTGAGCACA GGACGCTGGA GATGGTCCGG 1080
GAGTTCGCTG GCAATGCCCC ATGCTGGAGA GGATCGCGGC GGACCCTTGC GGTGCTGGCT 1140
GCACACTGTC CCTTCTACAG CTGGAAGAGA GTGTTCCTAA CCCACCCTGC CACCTGCTAC 1200
AGGACCACCT GCCCAGGCCC CTGTGACTCG CAGCCCTGCC AGAATGGAGG CACATGTGTT 1260
CCAGAAGGAC TGGACGGCTA CCAGTGCCTC TGCCCGCTGG CCTTTGGAGG GGAGGCTAAC 1320
TGTGCCCTGA AGCTGAGCCT GGAATGCAGG GTCGACCTCC TCTTCCTGCT GGACAGCTCT 1380
GCGGGCACCA CTCTGGACGG CTTCCTGCGG GCCAAAGTCT TCGTGAAGCG GTTTGTGCGG 1440
GCCGTGCTGA GCGAGGACTC TCGGGCCCGA GTGGGTGTGG CCACATACAG CAGGGAGCTG 1500
CTGGTGGCGG TGCCTGTGGG GGAGTACCAG GATGTGCCTG ACCTGGTCTG GAGCCTCGAT 1560
GGCATTCCCT TCCGTGGTGG CCCCACCCTG ACGGGCAGTG CCTTGCGGCA GGCGGCAGAG 1620
CGTGGCTTCG GGAGCGCCAC CAGGACAGGC CAGGACCGGC CACGTAGAGT GGTGGTTTTG 1680
CTCACTGAGT CACACTCCGA GGATGAGGTT GCGGGCCCAG CGCGTCACGC AAGGGCGCGA 1740
GAGCTGCTCC TGCTGGGTGT AGGCAGTGAG GCCGTGCGGG CAGAGCTGGA GGAGATCACA 1800
GGCAGCCCAA AGCATGTGAT GGTCTACTCG GATCCTCAGG ATCTGTTCAA CCAAATCCCT 1860
GAGCTGCAGG GGAAGCTGTG CAGCCXK3CAG CGGCCAGGGT GCCGGACACA AGCCCTGGAC 1920
CTCGTCTTCA TGTTGGACAC CTCTGCCTCA GTAGGGCCCG AGAATTTTGC TCAGATGCAG 1980
AGCTTTGTGA GAAGCTGTGC CCTCCAGTTT GAGGTGAACC CTGACGTGAC ACAGGTCGGC 2040
CTGGTGGTGT ATGGCAGCCA GGTGCAGACT GCCTTCGGGC TGGACACCAA ACCCACCCGG 2100
GCTGCGATGC TGCGGGCCAT TAGCCAGGCC CCCTACCTAG GTGGGGTGGG CTCAGCCGGC 2160
ACCGCCCTGC TGCACATCTA TGACAAAGTG ATGACCGTCG AGAGGGGTGC CCGGCCTGGT 2220
GTCCCCAAAG CTGTGGTGGT GCTCACAGGC GGGAGAGGCG CAGAGGATGC AGCCGTTCCT 2280
GCCCAGAAGC TGAGGAACAA TGGCATCTCT GTCTTGGTCG TGGGCGTGGG GCCTGTCCTA 2340
AGTGAGGGTC TGCGGAGGCT TGCAGGTCCC CGGGATTCCX! TGATCCACGT GGCAGCTTAC 2400
GCCGACCTGC GGTACCACCA GGACGTGCTC ATTGAGTGGC TGTGTGGAGA AGCCAAGCAG 2460
CCAGTCAACC TCTGCAAACC CAGCCCGTGC ATGAATGAGG GCAGCTGCGT CCTGCAGAAT 2520
GGGAGCTACC GCTGCAAGTG TCGGGATGGC TGGGAGGGCC CCCACTGCGA GAACCGTGAG 2580
TGGAGCTCTT GCTCTGTATG TGTGAGCCAG GGATGGATTC TTGAGACGCC CCTGAGGCAC 2640
ATGGCTCCCG TGCAGGAGGG CAGCAGCCGT ACCCCTCCCA GCAACTACAG AGAAGGCCTG 2700
GGCACTGAAA TGGTGCCTAC CTTCTGGAAT GTCTGTGCCC gAGGTCC OTA GA ATGTCTGC 2760
TTCCCGCCGT GGCCAGGACC ACTATTCTCA CTGAGGGAGG AGGATGTCCC AACTGCAGCC 2820
ATGCTGCTTA GAGACAAGAA AGCAGCTGAT GTCACCCACA AACGATGTTG TTGAAAAGTT 2880
TTGATGTGTA AGTAAATACC CACTTTCTGT ACCTGCTGTG CCTTGTT6AG GCTATGTCAT 2940
CTGCCACCTT TCCCTTGAGG ATAAACAAGG GGTCCTGAAG ACTTAAATTT AGCGGCCTGA 3000
CGTTCCTTTG CACACAATCA ATGCTCGCCA GAATGTTGTT GACACAGTAA TGCCCAGCAG 3060
AGGCCTTTAC TAGAGCATCC TTTGGACX3GC GAAGGCCACG GCCTTTCAAG ATGGAAAGCA 3120
GCA6CTTTTC CACTTCCCCA GAGACATTCT 6GATGCATTT GCATTGACTC TGAAAGGGGG 3180
CTTGAGGGAC GTTTGTGACT TCTTGGCGAC TGCCTTTTGT GTGTGGAAGA GACTTGGAAA 3240
GGTCTCAGAC TGAATGTGAC CAATTAACCA GCTTGGTTGA TGATGGGGGA GGGGCTGAGT 3300
TGTGCATGGG CCCAGGTCTG GAGGGCCACG TAAAATCXxTT CTGAGTC5QTG AGCA6TGTCC 3360
ACCTTGAAGG TCTTC



CBF9 Protein sequence

Gene name: ESTs
Unigene number: Hs. 157601
Protein Accession #: none found
Signal sequence: 1-17
Transmembrane domains: none found
VWA domains: 49-223; 341-518; 529-706
EGF domains: 298-333; 715-748
Cellular Localization: plasma membrane



1          11         21         31         41         51
|          |          |          |          |          |
MPPFLLLEAV CVFLFSRVPP SLPLQEVHVS KETIGKISAA SKMMWCSAAV DIMFLLDGSN 60
SVGKGSFERS KHFAITVCDG LDISPERVRV GAFQFSSTPH LEFPLDSFST QQEVKARIKR 120
MVFKGGRTET ELALKYLLHR GLPGGRNASV PQILIIVTDG KSQGDVALPS KQLKERGVTV 180
FAVGVRFPRW EELHALASEP RGQHVLLAEQ VEDATNGLFS TLSSSAICSS ATPDCRVEAH 240
PCEHRTLEMV REFAGNAPCW RGSRRTLAVL AAHCPFYSWK RVFLTHPATC YRTTCPGPCD 300
SQPCQNGGTC VPEGLDGYQC LCPLAFGGEA NCALKLSLEC RVDLLFLLDS SAGTTLDGFL 360
RAKVFVKRFV RAVLSEDSRA RVGVATYSRE LLVAVPVGEY QDVPDLVWSL DGIPFRGGPT 420
LTGSALRQAA ERGFGSATRT GQDRPRRVVV LLTESHSEDE VAGPARHARA RELLLLGVGS 480
EAVRAELEEI TGSPKHVMVY SDPQDLFNQI PELQGKLCSR QRPGCRTQAL DLVFMLDTSA 540
SVGPENFAQM QSFVRSCALQ FEVNPDVTQV GLVVYGSQVQ TAFGLDTKPT RAAMLRAISQ 600
APYLGGVGSA GTALLHIYDK VMTVQRGARP GVPKAVVVLT GGRGAEDAAV PAQKLRNNGI 660
SVLVVGVGPV LSEGLRRLAG PRDSLIHVAA YADLRYHQDV LIEWLCGEAK QPVNLCKPSP 720
CMNEGSCVLQ NGSYRCKCRD GWEGPHCENR EWSSCSVCVS QGWILETPLR HMAPVQEGSS 780
RTPPSNYREG LGTEMVPTFW NVCAPGP
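
The coding-sequence coordinates given for CBF9 (328-2751) can be checked against the protein listing by translating the nucleotide sequence. The sketch below is illustrative only: it assumes 1-based inclusive coordinates and the standard genetic code, and `translate_cds` is a hypothetical helper, not part of the disclosure. The toy input reuses the first four CBF9 codons (ATG CCC CCT TTC) followed by a stop codon.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(sequence: str, start: int, end: int) -> str:
    """Translate sequence[start..end] (1-based, inclusive); '*' marks a stop.

    Codons containing characters other than A, C, G, T are reported as 'X'.
    """
    cds = sequence[start - 1:end].upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding region length is not a multiple of three")
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds), 3)
    )


if __name__ == "__main__":
    toy = "GGATGCCCCCTTTCTAGAA"  # two flanking bases, five codons, two more bases
    print(translate_cds(toy, 3, 17))  # MPPF*
```

Under the same assumption, 328-2751 spans 2,424 nucleotides, that is, 808 codons: 807 residues plus the stop codon, which agrees with the length of the CBF9 protein sequence listed above.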



BF08 DNA SEQUENCE 



Gene name: TMPRSS3a
Unigene number: Hs. 298241
Probeset Accession #: AI538613
Nucleic Acid Accession #: AB038157
Coding sequence: 202-1566 (underlined sequences correspond to start and stop codons)



1          11         21         31         41         51
|          |          |          |          |          |
ACCGGGCACC GGACGGCTCG GGTACTTTCG TTCTTAATTA GGTCATGCCC GTGTGAGCCA 60
GGAAAGGGCT GTGTTTATGG GAAGCCAGTA ACACTQTGGC CTACTATCTC TTCCGTGGTO 120
CCATCTACAT TTTTGGGACT CGGGAATTAT GAGGTAGAGG TGGAGGCGGA GCCGGATGTC 180
AGAGGTCCTG AAATAGTCAC CATGGGGGAA AATGATCCGC CTGCTGTTGA AGCCCCCTTC 240
TCATTCCGAT CGCTTTTTGG CCTTGATGAT TTGAAAATAA GTCCTGTTGC ACCAGATGCA 300
GATGCTGTTG CTGCACAGAT CCTGTCACTG CTGCCATTGA AGTTTTTTCC AATCATCGTC 360
ATTGGGATCA TTGCATTGAT ATTAGCACTG GCCATTGGTC TGGGCATCCA CTTCGACTGC 420
TCAGGGAAGT ACAGATGTCG CTCATCCTTT AAGTGTATCG AGCTGATAGC TCGATGTGAC 480
GGAGTCTCGG ATTGCAAAGA CGGGGAGGAC GAGTACCGCT GTGTCCGGGT GGGTGGTCAG 540
AATGCCGTGC TCCAGGTGTT CACAGCTGCT TCGTGGAAGA CCATGTGCTC CGATGACTGG 600
AAGGGTCACT ACGCAAATGT TGCCTGTGCC CAACTGGGTT TCCCAAGCTA TGTGAGTTCA 660
GATAACCTCA GAGTGAGCTC GCTGGAGGGG CAGTTCCGGG AGGAGTTTGT GTCCATCGAT 720
CACCTCTTGC CAGATGACAA GGTGACTGCA TTACACCACT CAGTATATGT GAGGGAGGGA 780
TGTGCCTCTG GCCACGTGGT TACCTTGCAG TGCACAGCCT GTGGTCATAG AAGGGGCTAC 840
AGCTCACGCA TCGTGGGTGG AAACATGTCC TTGCTCTCGC AGTGGCCCTG GCAGGCCAGC 900
CTTCAGTTCC AGGGCTACCA CCTGTGCGGG GGCTCTGTCA TCACGCCCCT GTGGATCATC 960
ACTGCTGCAC ACTGTGTTTA TGACTTGTAC CTCCCCAAGT CATGGACCAT CCAGGTGGGT 1020
CTAGTTTCCC TGTTGGACAA TCCAGCCCCA TCCCACTTGG TGGAGAAGAT TGTCTACCAC 1080
AGCAAGTACA AGCCAAAGAG GCTGGGCAAT GACATCGCCC TTATGAAGCT GGCOGGGCCA 1140
CTCACGTTCA ATGAAATGAT CCAGCCTGTG TGCCTGCCCA ACTCTGAAGA GAACTTCCCC 1200
GATGGAAAAG TGTGCTGGAC GTCAGGATGG GGGGCCACAG AGGATGGAGC AGGTGACGCC 1260
TCCCCTGTCC TGAACCACGC GGCCGTCCCT TTGATTTCCA ACAAGATCTG CAACCACAGG 1320
GACGTGTACG GTGGCATCAT CTCCCCCTCC ATGCTCTGCG CGGGCTACCT GACGGGTGGC 1380
GTGGACAGCT GCCAGGGGGA CAGCGGGGGG CCCCTGGTGT GTCAAGAGAG GAGGCTGTGG 1440
AAGTTAGTGG GAGCXSACCAG CTTTGGCATC GGCTGCGCAG 7U3GTGAACAA GCCTGGGGTG 1500
TACACCCGTG TCACCTCCTT CCTGGACTGG ATCCACGAGC AGATGGAGAG AGACCTAAAA 1560
ACCTGAAGAG GAAGGGGACA AGTAGCCACC TGAGTTCCTG AGGTGATGAA GACAGCCCGA 1620
TCCTCCCCTG GACTCCCGTG TAGGAACCTG CACACGAGCA GACACCCTTG GAGCTCTGAG 1680
TTCCGGCACC AGTAGCAGGC CCGAAAGAGG CACCCTTCCA TCTGATTCCA GCACAACCTT 1740
CAAGCTGCTT TTT6TTTTTT GTTTTTTTGA GGTGGAGTCT CGCTCTGTTG CCCAGGCTGG 1800
A6TGCAGTQG CX3AAATCCCT GCTCACTGCA GCCTCOSCTT CCCTGGTTCA AGCGATTCTC 1860
TTGCCTCA6C TTCCCCAGTA GCTGGGACCA CAGGTGCCCG CCACCACACC CAACTAATTT 1920
TTGTATTTTT AGTAGAGACA G6GTTTCACC ATGTTGGCCA GGCTGCTCTC AAACCCCTGA 1980
CCTCAAATGA TGTGCCTGCT TCAGCCTCCC ACAGTGCTGG GATTACAGGC ATGGGCCACC 2040
ACGCCTAGCC TCACGCTCCT TTCTGATCTT CACTAAGAAC AAAAGAAGCA GCAACTTGCA 2100
AGGGCGGCCT TTCCCACTGG TCCATCTGGT TTTCTCTCCA GGGGTCTTGC AAAATTCCTG 2160
ACGAGATAAG CAGTTATGTG ACCTCACXSTG CAAAGCCACC AACAGCCACT CAGAAAAGAC 2220
GCACCAGCCC AGAAGTGCAG AACTGCAGTC ACTGCACGTT TTCATCTCTA GGGACCAGAA 2280
CCAAACCCAC CCTTTCTACT TCCAAGACTT ATTTTCACAT GTGGGGAGGT TAATCTAGGA 2340
ATGACTCGTT TAAGGCCTAT TTTCATGATT TCTTTGTAGC ATTTGGTGCT TGACGTATTA 2400
TTGTCCTTTG ATTCCAAATA ATATGTTTCC TTCCCTCAAA AAAAAAAAAA AAAAAAAAAA 2460
AAAAAAAA



BF08 Protein sequence

Gene name: TMPRSS3a
Unigene number: Hs. 298241
Probeset Accession #: AI538613
Protein Accession #: BAB20077
Signal sequence: none found
Transmembrane domains: 43-65, 239-261
Tryp_SPc domain: 216-444
Cellular Localization: not determined



1          11         21         31         41         51
|          |          |          |          |          |
MGENDPPAVE APFSFRSLFG LDDLKISPVA PDADAVAAQI LSLLPLKFFP IIVIGIIALI 60
LALAIGLGIH FDCSGKYRCR SSFKCIELIA RCDGVSDCKD GEDEYRCVRV GGQNAVLQVF 120
TAASWKTMCS DDWKGHYANV ACAQLGFPSY VSSDNLRVSS LEGQFREEFV SIDHLLPDDK 180
VTALHHSVYV REGCASGHVV TLQCTACGHR RGYSSRIVGG NMSLLSQWPW QASLQFQGYH 240
LCGGSVITPL WIITAAHCVY DLYLPKSWTI QVGLVSLLDN PAPSHLVEKI VYHSKYKPKR 300
LGNDIALMKL AGPLTFNEMI QPVCLPNSEE NFPDGKVCWT SGWGATEDGA GDASPVLNHA 360
AVPLISNKIC NHRDVYGGII SPSMLCAGYL TGGVDSCQGD SGGPLVCQER RLWKLVGATS 420
FGIGCAEVNK PGVYTRVTSF LDWIHEQMER DLKT
WHAT IS CLAIMED IS:

1. A method of screening drug candidates comprising:
a) providing a cell that expresses an expression profile gene selected from the group consisting of an expression profile gene set forth in Table 1 or Table 2 or fragment thereof;
b) adding a drug candidate to said cell; and
c) determining the effect of said drug candidate on the expression of said expression profile gene.

2. A method according to claim 1 wherein said determining comprises comparing the level of expression in the absence of said drug candidate to the level of expression in the presence of said drug candidate.

3. A method of screening for a bioactive agent capable of binding to a colorectal cancer modulator protein (colorectal cancer modulator protein), wherein said colorectal cancer modulator protein is encoded by a nucleic acid selected from the group consisting of a nucleic acid of Table 1 or Table 2 or a fragment thereof, said method comprising:
a) combining said colorectal cancer modulator protein and a candidate bioactive agent; and
b) determining the binding of said candidate agent to said colorectal cancer modulator protein.

4. A method for screening for a bioactive agent capable of modulating the activity of a colorectal cancer modulator protein, wherein said colorectal cancer modulator protein is encoded by a nucleic acid selected from the group consisting of a nucleic acid of Table 1 or Table 2 or a fragment thereof, said method comprising:
a) combining said colorectal cancer modulator protein and a candidate bioactive agent; and
b) determining the effect of said candidate agent on the bioactivity of said colorectal cancer modulator protein.

5. A method of evaluating the effect of a candidate colorectal cancer drug comprising:
a) administering said drug to a patient;
b) removing a cell sample from said patient; and
c) determining the expression of a gene selected from the group consisting of a nucleic acid of Table 1 or Table 2.

6. A method according to claim 5 further comprising comparing said expression profile to an expression profile of a healthy individual.

7. A method of diagnosing colorectal cancer comprising:
a) determining the expression of one or more genes selected from the group consisting of a nucleic acid of Table 1 or Table 2 or a fragment thereof or a polypeptide encoded thereby in a first tissue type of a first individual; and
b) comparing said expression of said gene(s) from a second normal tissue type from said first individual or a second unaffected individual;
wherein a difference in said expression indicates that the first individual has colorectal cancer.

8. A method for screening for a bioactive agent capable of interfering with the binding of a colorectal cancer modulator protein (colorectal cancer modulator protein) or a fragment thereof and an antibody which binds to said colorectal cancer modulator protein or fragment thereof, said method comprising:
a) combining a colorectal cancer modulator protein or fragment thereof, a candidate bioactive agent and an antibody which binds to said colorectal cancer modulator protein or fragment thereof; and
b) determining the binding of said colorectal cancer modulator protein or fragment thereof and said antibody.
9. A method for inhibiting the activity of a colorectal cancer modulator protein (colorectal cancer modulator protein), wherein said colorectal cancer modulator protein is encoded by a nucleic acid selected from the group consisting of a nucleic acid of Table 1 or Table 2 or a fragment thereof, said method comprising binding an inhibitor to said colorectal cancer modulator protein.

10. A method according to claim 9 wherein said inhibitor is an antibody.

11. A method of treating colorectal cancer comprising administering to a patient an inhibitor of a colorectal cancer modulator protein, wherein said colorectal cancer modulator protein is encoded by a nucleic acid selected from the group consisting of a nucleic acid of Table 1 or Table 2 or a fragment thereof.

12. A method according to claim 11 wherein said inhibitor is an antibody.

13. A method of neutralizing the effect of a colorectal cancer modulator protein, or a fragment thereof, comprising contacting an agent specific for said protein with said protein in an amount sufficient to effect neutralization.

14. A method for localizing a therapeutic moiety to colorectal cancer tissue comprising exposing said tissue to an antibody to a colorectal cancer modulator protein or fragment thereof conjugated to said therapeutic moiety.

15. The method of Claim 14, wherein said therapeutic moiety is a cytotoxic agent.

16. The method of Claim 14, wherein said therapeutic moiety is a radioisotope.

17. A method for inhibiting colorectal cancer in a cell, wherein said method comprises administering to a cell a composition comprising antisense molecules to a nucleic acid of Table 1 or Table 2.

18. An antibody which specifically binds to a protein encoded by a nucleic acid of Table 1 or Table 2 or a fragment thereof.
19. The antibody of Claim 18, wherein said antibody is a monoclonal antibody.

20. The antibody of Claim 18, wherein said antibody is a humanized antibody.

21. The antibody of Claim 18, wherein said antibody is an antibody fragment.

22. A biochip comprising one or more nucleic acid segments selected from the group consisting of a nucleic acid of Table 1 or Table 2 or a fragment thereof, wherein said biochip comprises fewer than 1000 nucleic acid probes.

23. A nucleic acid having a sequence at least 95% homologous to a sequence of a nucleic acid of Table 1 or Table 2 or its complement.

24. A nucleic acid which hybridizes under high stringency to a nucleic acid of Table 1 or Table 2 or its complement.

25. A polypeptide encoded by the nucleic acid of Claim 23 or 24.

26. A method of eliciting an immune response in an individual, said method comprising administering to said individual a composition comprising the polypeptide of Claim 25 or a fragment thereof.

27. A method of eliciting an immune response in an individual, said method comprising administering to said individual a composition comprising a nucleic acid comprising a sequence of a nucleic acid of Table 1 or Table 2 or a fragment thereof.

28. A method of determining the prognosis of an individual with colorectal cancer comprising:
a) determining the expression of one or more genes selected from the group consisting of a nucleic acid of Table 1 or Table 2 or a fragment thereof in a first tissue type of a first individual; and
b) comparing said expression of said gene(s) from a second normal tissue type from said first individual or a second unaffected individual;
wherein a substantial difference in said expression indicates a poor prognosis.

29. A method of treating colorectal cancer comprising administering to an individual having colorectal cancer an antibody to a colorectal cancer modulator protein or fragment thereof conjugated to a therapeutic moiety.

30. The method of Claim 29, wherein said therapeutic moiety is a cytotoxic agent.

31. The method of Claim 29, wherein said therapeutic moiety is a radioisotope.
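
Claims 2, 6, 7 and 28 turn on comparing expression levels between two conditions. One common way to express such a comparison is a log2 fold change; the sketch below is a hedged illustration, not the method of the claims. The pseudocount, the threshold of 1.0 (a two-fold change) and the function names are assumptions chosen for the example.

```python
import math


def log2_fold_change(treated: float, untreated: float, pseudocount: float = 1.0) -> float:
    """Log2 ratio of expression with versus without the candidate agent.

    The pseudocount (an assumed value) avoids division by zero when a
    gene is not detected in one of the two conditions.
    """
    if treated < 0 or untreated < 0:
        raise ValueError("expression levels must be non-negative")
    return math.log2((treated + pseudocount) / (untreated + pseudocount))


def is_modulated(treated: float, untreated: float, threshold: float = 1.0) -> bool:
    """Flag a gene whose expression changes by at least `threshold` log2 units."""
    return abs(log2_fold_change(treated, untreated)) >= threshold


if __name__ == "__main__":
    # Hypothetical expression values in arbitrary units.
    print(log2_fold_change(3.0, 1.0))   # 1.0
    print(is_modulated(100.0, 100.0))   # False
```

A symmetric threshold on the absolute log ratio treats up-regulation and down-regulation alike, which fits the claims' neutral wording of a "difference" in expression.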
Continuation of Box I Reason 2:

Claims 1-7, 9-12, and 17-28 have been found to be unsearchable under Article 17(2)(b) because of defects under Article 17(2)(a).
More particularly, claims 1-7, 9-12, and 17-28 are drawn to expression profile genes set forth in Table 1 or Table 2 or fragment
thereof, but Table 1 does not set forth the sequence of the expression profile genes and, while Table 2 sets forth the sequence of the
expression profile genes, Table 2 does not identify the sequences by a sequence identification number that corresponds to the
identical sequence contained in the Sequence Listing on the Computer Readable Format. Therefore, the claims could not be searched
because the sequences to which the claims refer are not disclosed or cannot be searched.

BOX II. OBSERVATIONS WHERE UNITY OF INVENTION IS LACKING

This application contains the following inventions or groups of inventions which are not so linked as to form a single general
inventive concept under PCT Rule 13.1. In order for all inventions to be examined, the appropriate additional examination fees must
be paid.

Group I, claim(s) 8, drawn to a method for screening for a bioactive agent.
Group II, claim(s) 13, drawn to a method for neutralizing the effect of a colorectal cancer modulator or a fragment thereof.
Group III, claim(s) 14-16, drawn to a method for localizing a therapeutic moiety to a colorectal cancer tissue.
Group IV, claim(s) 29-31, drawn to a method for treating colorectal cancer.

The inventions listed as Groups I-IV do not relate to a single general inventive concept under PCT Rule 13.1 because, under PCT
Rule 13.2, they lack the same or corresponding special technical features for the following reasons:

The special technical feature of Group I is contacting a protein with an antibody in the presence of a bioactive agent to inhibit the
binding of the protein to the antibody.

The special technical feature of Group II is contacting a protein with an agent that neutralizes the effect of the protein.

The special technical feature of Group III is exposing a tissue to an antibody conjugated to a therapeutic moiety.

The special technical feature of Group IV is administering to an individual an antibody conjugated to a therapeutic moiety.

Therefore, Groups I-IV do not share the same or corresponding special technical feature so as to form a single general inventive
concept.